NOVEL METHODS OF CONSTRUCTING LIBRARIES
COMPRISING DISPLAYED AND/OR EXPRESSED
MEMBERS OF A DIVERSE FAMILY OF PEPTIDES,
POLYPEPTIDES OR PROTEINS AND THE NOVEL LIBRARIES

This application is a continuation-in-part of United States
provisional application 60/198,069, filed April 17, 2000, a
continuation-in-part of United States patent application 09/837,306,
filed on April 17, 2001, a continuation-in-part of PCT application
PCT/US01/12454, filed on April 17, 2001, a continuation-in-part of
United States application 10/000,516, filed on October 24, 2001 and a
continuation-in-part of United States application 10/045,674, filed on
October 25, 2001. All of the earlier applications are specifically
incorporated by reference herein.

The present invention relates to libraries of genetic packages that
display and/or express a member of a diverse family of peptides,
polypeptides or proteins and collectively display and/or express at
least a portion of the diversity of the family. In an alternative
embodiment, the invention relates to libraries that include a member
of a diverse family of peptides, polypeptides or proteins and
collectively comprise at least a portion of the diversity of the
family. In a preferred embodiment, the displayed and/or expressed
polypeptides are human Fabs.

More specifically, the invention is directed to the methods of
cleaving single-stranded nucleic acids at chosen locations, the
cleaved nucleic acids encoding, at least in part, the peptides,
polypeptides or proteins displayed on the genetic packages of, and/or
expressed in, the libraries of the invention. In a preferred
embodiment, the genetic packages are filamentous phage or phagemids or
yeast.

The present invention further relates to vectors for displaying and/or
expressing a diverse family of peptides, polypeptides or proteins.

The present invention further relates to methods of screening the
libraries of the invention and to the peptides, polypeptides and
proteins identified by such screening.

BACKGROUND OF THE INVENTION 

It is now common practice in the art to prepare libraries of genetic
packages that display, express or comprise a member of a diverse
family of peptides, polypeptides or proteins and collectively display,
express or comprise at least a portion of the diversity of the family.
In many common libraries, the peptides, polypeptides or proteins are
related to antibodies. Often, they are Fabs or single chain
antibodies.

In general, the DNAs that encode members of the families to be
displayed and/or expressed must be amplified before they are cloned
and used to display and/or express the desired member. Such
amplification typically makes use of forward and backward primers.

Such primers can be complementary to sequences native to the DNA to be
amplified or complementary to oligonucleotides attached at the 5' or
3' ends of that DNA. Primers that are complementary to sequences
native to the DNA to be amplified are disadvantaged in that they bias
the members of the families to be displayed. Only those members that
contain a sequence in the native DNA that is substantially
complementary to the primer will be amplified. Those that do not will
be absent from the family. For those members that are amplified, any
diversity within the primer region will be suppressed.
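
By way of illustration only, the two effects described above (loss of
members that do not match the primer, and suppression of diversity
within the primer region) can be sketched with a toy model. The
function name, the mismatch threshold and the short sequences are
hypothetical and are not taken from this application; the primer is
written in the same sense as the template for simplicity.

```python
def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def amplify_with_native_primer(templates, primer, max_mismatches=2):
    """Toy model of amplification with a primer to a native sequence.

    A template is amplified only if its 5' region is substantially
    similar to the primer (at most max_mismatches differences, an
    assumed threshold). Each product carries the primer's own sequence
    in the primed region, so template diversity there is overwritten.
    """
    n = len(primer)
    products = []
    for template in templates:
        if mismatches(template[:n], primer) <= max_mismatches:
            products.append(primer + template[n:])
    return products


if __name__ == "__main__":
    family = ["ACGTACGTTTGA", "ACGAACGTCCGA", "TTTTTTTTGGGA"]
    for product in amplify_with_native_primer(family, "ACGTACGT"):
        print(product)
```

In this sketch the third family member is absent from the output, and
the second member loses its single difference within the primer
region.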

For example, in European patent 368,684 B1, the primer that is used is
at the 5' end of the VH region of an antibody gene. It anneals to a
sequence region in the native DNA that is said to be "sufficiently
well conserved" within a single species. Such primer will bias the
members amplified to those having this "conserved" region. Any
diversity within this region is extinguished.

It is generally accepted that human antibody genes arise through a
process that involves a combinatorial selection of V and J or V, D,
and J followed by somatic mutations. Although most diversity occurs in
the Complementarity Determining Regions (CDRs), diversity also occurs
in the more conserved Framework Regions (FRs) and at least some of
this diversity confers or enhances specific binding to antigens (Ag).
As a consequence, libraries should contain as much of the CDR and FR
diversity as possible.

To clone the amplified DNAs for display of the peptides, polypeptides
or proteins that they encode on a genetic package and/or for
expression, the DNAs must be cleaved to produce appropriate ends for
ligation to a vector. Such cleavage is generally effected using
restriction endonuclease recognition sites carried on the primers.
When the primers are at the 5' end of DNA produced from reverse
transcription of RNA, such restriction leaves deleterious 5'
untranslated regions in the amplified DNA. These regions interfere
with expression of the cloned genes and thus the display of the
peptides, polypeptides and proteins coded for by them.

SUMMARY OF THE INVENTION 

It is an object of this invention to provide novel methods for
constructing libraries that display, express or comprise a member of a
diverse family of peptides, polypeptides or proteins and collectively
display, express or comprise at least a portion of the diversity of
the family. These methods are not biased toward DNAs that contain
native sequences that are complementary to the primers used for
amplification. They also enable any sequences that may be deleterious
to expression to be removed from the amplified DNA before cloning and
displaying and/or expressing.

It is another object of this invention to provide a method for
cleaving single-stranded nucleic acid sequences at a desired location,
the method comprising the steps of:

    (i) contacting the nucleic acid with a single-stranded
    oligonucleotide, the oligonucleotide being functionally
    complementary to the nucleic acid in the region in which cleavage
    is desired and including a sequence that with its complement in
    the nucleic acid forms a restriction endonuclease recognition site
    that on restriction results in cleavage of the nucleic acid at the
    desired location; and

    (ii) cleaving the nucleic acid solely at the recognition site
    formed by the complementation of the nucleic acid and the
    oligonucleotide;

the contacting and the cleaving steps being performed at a temperature
sufficient to maintain the nucleic acid in substantially
single-stranded form, the oligonucleotide being functionally
complementary to the nucleic acid over a large enough region to allow
the two strands to associate such that cleavage may occur at the
chosen temperature and at the desired location, and the cleavage being
carried out using a restriction endonuclease that is active at the
chosen temperature.
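
By way of illustration only, steps (i) and (ii) can be sketched as a
sequence-level model, assuming exact complementarity and ignoring
temperature and enzyme kinetics. The function names, the recognition
sequence and the cut offset used in the usage example are hypothetical
inputs and are not taken from this application. The sketch shows that
a copy of the recognition sequence lying in a region that remains
single-stranded is left uncut.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def cleave_with_oligo(ssdna, oligo, site, cut_offset):
    """Cleave single-stranded DNA only where an annealed oligo forms a site.

    ssdna      -- the single-stranded nucleic acid, written 5' to 3'
    oligo      -- oligonucleotide complementary to the region to be cut
    site       -- recognition sequence as read on ssdna (assumed input)
    cut_offset -- bases from the start of the site to the cut (assumed)

    Returns (left, right) fragments, or None if no duplex site forms.
    """
    duplex_region = revcomp(oligo)          # part of ssdna paired with oligo
    start = ssdna.find(duplex_region)
    if start < 0:
        return None                         # oligo does not anneal
    local = duplex_region.find(site)
    if local < 0:
        return None                         # no site inside the duplex
    cut = start + local + cut_offset
    return ssdna[:cut], ssdna[cut:]


if __name__ == "__main__":
    strand = "GAATTCTTTTCCCCAGGAATTCTGGAAAA"
    oligo = revcomp("CCAGGAATTCTGG")
    print(cleave_with_oligo(strand, oligo, "GAATTC", 1))
```

In the usage example the first copy of the recognition sequence, which
is not covered by the oligonucleotide, is not cleaved.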

It is a further object of this invention to provide an alternative
method for cleaving single-stranded nucleic acid sequences at a
desired location, the method comprising the steps of:

    (i) contacting the nucleic acid with a partially double-stranded
    oligonucleotide, the single-stranded region of the oligonucleotide
    being functionally complementary to the nucleic acid in the region
    in which cleavage is desired, and the double-stranded region of
    the oligonucleotide having a restriction endonuclease recognition
    site; and

    (ii) cleaving the nucleic acid solely at the cleavage site formed
    by the complementation of the nucleic acid and the single-stranded
    region of the oligonucleotide;

the contacting and the cleaving steps being performed at a temperature
sufficient to maintain the nucleic acid in substantially
single-stranded form, the oligonucleotide being functionally
complementary to the nucleic acid over a large enough region to allow
the two strands to associate such that cleavage may occur at the
chosen temperature and at the desired location, and the cleavage being
carried out using a restriction endonuclease that is active at the
chosen temperature.

In an alternative embodiment of this object of the invention, the
restriction endonuclease recognition site is not initially located in
the double-stranded part of the oligonucleotide. Instead, it is part
of an amplification primer, which primer is complementary to the
double-stranded region of the oligonucleotide. On amplification of the
DNA-partially double-stranded oligonucleotide combination, the
restriction endonuclease recognition site carried on the primer
becomes part of the DNA. It can then be used to cleave the DNA.

Preferably, the restriction endonuclease recognition site is that of a
Type II-S restriction endonuclease whose cleavage site is located at a
known distance from its recognition site.
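
By way of illustration only, the relationship between a recognition
site and a cleavage site located at a known distance from it can be
sketched as follows. The function name and the recognition sequence
and distance used in the usage example are illustrative inputs, not
values taken from this application; only one strand is modeled.

```python
def type_iis_cut_position(strand, recognition, distance):
    """Return the index at which the strand is cut, or None.

    The cut is placed `distance` bases downstream of the last base of
    the first occurrence of `recognition`. Both values are assumed
    inputs that would be looked up for the chosen enzyme.
    """
    start = strand.find(recognition)
    if start < 0:
        return None                      # no recognition site present
    cut = start + len(recognition) + distance
    if cut > len(strand):
        return None                      # cut would fall off the strand
    return cut


if __name__ == "__main__":
    strand = "AAGGATGCCCCCCCCCTTTT"
    cut = type_iis_cut_position(strand, "GGATG", 9)
    print(strand[:cut], strand[cut:])
```

Because the cut falls outside the recognition sequence, the bases at
the cleavage site itself may vary among members of a family.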

It is another object of the present invention to provide a method of
capturing DNA molecules that comprise a member of a diverse family of
DNAs and collectively comprise at least a portion of the diversity of
the family. These DNA molecules in single-stranded form have been
cleaved by one of the methods of this invention. This method involves
ligating the individual single-stranded DNA members of the family to a
partially duplex DNA complex. The method comprises the steps of:

    (i) contacting a single-stranded nucleic acid sequence that has
    been cleaved with a restriction endonuclease with a partially
    double-stranded oligonucleotide, the single-stranded region of the
    oligonucleotide being functionally complementary to the nucleic
    acid in the region that remains after cleavage, the
    double-stranded region of the oligonucleotide including any
    sequences necessary to return the sequences that remain after
    cleavage into proper reading frame for expression and containing a
    restriction endonuclease recognition site 5' of those sequences;
    and

    (ii) cleaving the partially double-stranded oligonucleotide
    sequence solely at the restriction endonuclease cleavage site
    contained within the double-stranded region of the partially
    double-stranded oligonucleotide.

As before, in this object of the invention, the restriction
endonuclease recognition site need not be located in the
double-stranded portion of the oligonucleotide. Instead, it can be
introduced on amplification with an amplification primer that is used
to amplify the DNA-partially double-stranded oligonucleotide
combination.



WO 02/083872 



PCT/US02/12405



It is another object of this invention to prepare libraries that
display, express or comprise a diverse family of peptides,
polypeptides or proteins and collectively display, express or comprise
at least part of the diversity of the family, using the methods and
DNAs described above.

It is an object of this invention to screen those libraries to
identify useful peptides, polypeptides and proteins and to use those
substances in human therapy.

Additional objects of the invention are reflected in claims 1-116.
Each of these claims is specifically incorporated by reference in this
specification.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic of various methods that may be employed to
amplify VH genes without using primers specific for VH sequences.

FIG. 2 is a schematic of various methods that may be employed to
amplify VL genes without using primers specific for VL sequences.

FIG. 3 is a schematic of RACE amplification of antibody heavy and
light chains.

FIG. 4 depicts gel analysis of amplification products obtained after
the primary PCR reaction from 4 different patient samples.

FIG. 5 depicts gel analysis of cleaved kappa DNA from Example 2.

FIG. 6 depicts gel analysis of extender-cleaved kappa DNA from
Example 2.

FIG. 7 depicts gel analysis of the PCR product from the extender-kappa
amplification from Example 2.

FIG. 8 depicts gel analysis of purified PCR product from the
extender-kappa amplification from Example 2.

FIG. 9 depicts gel analysis of cleaved and ligated kappa light chains
from Example 2.

FIG. 10 is a schematic of the design for CDR1 and CDR2 synthetic
diversity.

FIG. 11 is a schematic of the cloning schedule for construction of the
heavy chain repertoire.

FIG. 12 is a schematic of the cleavage and ligation of the antibody
light chain.

FIG. 13 depicts gel analysis of cleaved and ligated lambda light
chains from Example 4.

FIG. 14 is a schematic of the cleavage and ligation of the antibody
heavy chain.

FIG. 15 depicts gel analysis of cleaved and ligated lambda light
chains from Example 5.

FIG. 16 is a schematic of a phage display vector.

FIG. 17 is a schematic of a Fab cassette.

FIG. 18 is a schematic of a process for incorporating fixed FR1
residues in an antibody lambda sequence.

FIG. 19 is a schematic of a process for incorporating fixed FR1
residues in an antibody kappa sequence.

FIG. 20 is a schematic of a process for incorporating fixed FR1
residues in an antibody heavy chain sequence.

TERMS

In this application, the following terms and abbreviations are used:

Sense strand              The upper strand of ds DNA as usually written.
                          In the sense strand, 5'-ATG-3' codes for Met.

Antisense strand          The lower strand of ds DNA as usually written.
                          In the antisense strand, 3'-TAC-5' would
                          correspond to a Met codon in the sense strand.

Forward primer            A "forward" primer is complementary to a part
                          of the sense strand and primes for synthesis
                          of a new antisense-strand molecule. "Forward
                          primer" and "lower-strand primer" are
                          equivalent.

Backward primer           A "backward" primer is complementary to a part
                          of the antisense strand and primes for
                          synthesis of a new sense-strand molecule.
                          "Backward primer" and "top-strand primer" are
                          equivalent.

Bases                     Bases are specified either by their position
                          in a vector or gene or by their position
                          within a gene by codon and base. For example,
                          "89.1" is the first base of codon 89 and
                          "89.2" is the second base of codon 89.

Sv                        Streptavidin

Ap                        Ampicillin

apR                       A gene conferring ampicillin resistance.

RERS                      Restriction endonuclease recognition site

RE                        Restriction endonuclease; cleaves
                          preferentially at RERS

URE                       Universal restriction endonuclease

Functionally              Two sequences are sufficiently complementary
complementary             so as to anneal under the chosen conditions.

AA                        Amino acid

PCR                       Polymerase chain reaction

                          Germline genes

Ab                        Antibody: an immunoglobulin. The term also
                          covers any protein having a binding domain
                          which is homologous to an immunoglobulin
                          binding domain. A few examples of antibodies
                          within this definition are, inter alia,
                          immunoglobulin isotypes and the Fab, F(ab')2,
                          scFv, Fv, dAb and Fd fragments.

Fab                       Two chain molecule comprising an Ab light
                          chain and part of a heavy chain.

scFv                      A single-chain Ab comprising either
                          VH::linker::VL or VL::linker::VH

                          Wild type

                          Heavy chain

                          Light chain

                          A variable domain of a kappa light chain.

VH                        A variable domain of a heavy chain.

VL                        A variable domain of a lambda light chain.
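
By way of illustration only, the codon-and-base notation defined under
"Bases" in the list of terms can be converted to a nucleotide index as
sketched below. The function name, the zero-based indexing and the
`codon_start` parameter (the index of the first base of codon 1) are
assumptions made for this sketch and are not part of this application.

```python
def base_index(label, codon_start=0):
    """Convert a "codon.base" label such as "89.1" to a 0-based index.

    codon_start is the assumed 0-based index of base 1 of codon 1
    within the gene or vector sequence being addressed.
    """
    codon_text, base_text = label.split(".")
    codon, base = int(codon_text), int(base_text)
    if codon < 1 or base not in (1, 2, 3):
        raise ValueError(f"not a valid codon.base label: {label!r}")
    return codon_start + (codon - 1) * 3 + (base - 1)


if __name__ == "__main__":
    for label in ("1.1", "89.1", "89.2"):
        print(label, base_index(label))
```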

In this application when it is said that nucleic acids are cleaved
solely at the cleavage site of a restriction endonuclease, it should
be understood that minor cleavage may occur at random, e.g., at
non-specific sites other than the specific cleavage site that is
characteristic of the restriction endonuclease. The skilled worker
will recognize that such non-specific, random cleavage is the usual
occurrence. Accordingly, "solely at the cleavage site" of a
restriction endonuclease means that cleavage occurs preferentially at
the site characteristic of that endonuclease.

As used in this application and claims, the term "cleavage site formed
by the complementation of the nucleic acid and the single-stranded
region of the oligonucleotide" includes cleavage sites formed by the
single-stranded portion of the partially double-stranded
oligonucleotide duplexing with the single-stranded DNA, cleavage sites
in the double-stranded portion of the partially double-stranded
oligonucleotide, and cleavage sites introduced by the amplification
primer used to amplify the single-stranded DNA-partially
double-stranded oligonucleotide combination.

In the two methods of this invention for preparing single-stranded
nucleic acid sequences, the first of those cleavage sites is
preferred. In the methods of this invention for capturing diversity
and cloning a family of diverse nucleic acid sequences, the latter two
cleavage sites are preferred.

In this application, all references referred to are specifically
incorporated by reference.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

The nucleic acid sequences that are useful in the methods of this
invention, i.e., those that encode at least in part the individual
peptides, polypeptides and proteins displayed, or expressed in or
comprising the libraries of this invention, may be native, synthetic
or a combination thereof. They may be mRNA, DNA or cDNA. In the
preferred embodiment, the nucleic acids encode antibodies. Most
preferably, they encode Fabs.

The nucleic acids useful in this invention may be naturally diverse;
synthetic diversity may be introduced into those naturally diverse
members; or the diversity may be entirely synthetic. For example,
synthetic diversity can be introduced into one or more CDRs of
antibody genes. Preferably, it is introduced into CDR1 and CDR2 of
immunoglobulins. Preferably, natural diversity is captured in the CDR3
regions of the immunoglobulin genes of this invention from B cells.
Most preferably, the nucleic acids of this invention comprise a
population of immunoglobulin genes that comprise synthetic diversity
in at least one, and more preferably both, of the CDR1 and CDR2 and
diversity in CDR3 captured from B cells.

Synthetic diversity may be created, for example, through the use of
TRIM technology (U.S. 5,869,644). TRIM technology allows control over
exactly which amino-acid types are allowed at variegated positions and
in what proportions. In TRIM technology, codons to be diversified are
synthesized using mixtures of trinucleotides. This allows any set of
amino acid types to be included in any proportion.

Another alternative that may be used to generate diversified DNA is
mixed oligonucleotide synthesis. With TRIM technology, one could allow
Ala and Trp. With mixed oligonucleotide synthesis, a mixture that
included Ala and Trp would also necessarily include Ser and Gly. The
amino-acid types allowed at the variegated positions are picked with
reference to the structure of antibodies, or other peptides,
polypeptides or proteins of the family, the observed diversity in
germline genes, the somatic mutations frequently observed, and the
desired areas and types of variegation.
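
By way of illustration only, the Ala/Trp example above can be checked
against the standard genetic code. In the sketch below, mixing bases
independently at each codon position (G or T at position 1, C or G at
position 2, G at position 3) is one assumed way of covering an Ala
codon (GCG) and the Trp codon (TGG); the function names are
hypothetical. A trinucleotide mixture, by contrast, encodes only the
codons actually supplied.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
_BASES = "TCAG"
_AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: _AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def mixed_oligo_amino_acids(pos1, pos2, pos3):
    """Amino acids encoded when bases are mixed independently per position."""
    return sorted({CODON_TABLE[a + b + c] for a, b, c in product(pos1, pos2, pos3)})


def trinucleotide_amino_acids(trinucleotides):
    """Amino acids encoded by an explicit mixture of whole codons."""
    return sorted({CODON_TABLE[t] for t in trinucleotides})


if __name__ == "__main__":
    print(mixed_oligo_amino_acids("GT", "CG", "G"))   # Ala, Gly, Ser, Trp
    print(trinucleotide_amino_acids(["GCT", "TGG"]))  # Ala, Trp only
```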

In a preferred embodiment of this invention, the nucleic acid
sequences for at least one CDR or other region of the peptides,
polypeptides or proteins of the family are cDNAs produced by reverse
transcription from mRNA. More preferably, the mRNAs are obtained from
peripheral blood cells, bone marrow cells, spleen cells or lymph node
cells (such as B-lymphocytes or plasma cells) that express members of
naturally diverse sets of related genes. More preferably, the mRNAs
encode a diverse family of antibodies. Most preferably, the mRNAs are
obtained from patients suffering from at least one autoimmune disorder
or cancer. Preferably, mRNAs are used that represent a high diversity
of autoimmune diseases, such as systemic lupus erythematosus, systemic
sclerosis, rheumatoid arthritis, antiphospholipid syndrome and
vasculitis.

In a preferred embodiment of this invention, the cDNAs are produced
from the mRNAs using reverse transcription. In this preferred
embodiment, the mRNAs are separated from the cell and degraded using
standard methods, such that only the full length (i.e., capped) mRNAs
remain. The cap is then removed and reverse transcription used to
produce the cDNAs.

The reverse transcription of the first (antisense) strand can be done
in any manner with any suitable primer. See, e.g., HJ de Haard et al.,
Journal of Biological Chemistry, 274(26):18218-30 (1999). In the
preferred embodiment of this invention where the mRNAs encode
antibodies, primers that are complementary to the constant regions of
antibody genes may be used. Those primers are useful because they do
not generate bias toward subclasses of antibodies. In another
embodiment, poly-dT primers may be used (and may be preferred for the
heavy-chain genes). Alternatively, sequences complementary to the
primer may be attached to the termini of the antisense strand.

In one preferred embodiment of this invention, the reverse
transcriptase primer may be biotinylated, thus allowing the cDNA
product to be immobilized on streptavidin (Sv) beads. Immobilization
can also be effected using a primer labeled at the 5' end with one of
a) free amine group, b) thiol, c) carboxylic acid, or d) another group
not found in DNA that can react to form a strong bond to a known
partner on an insoluble medium. If, for example, a free amine
(preferably primary amine) is provided at the 5' end of a DNA primer,
this amine can be reacted with carboxylic acid groups on a polymer
bead using standard amide-forming chemistry. If such preferred
immobilization is used during reverse transcription, the top strand
RNA is degraded using well-known enzymes, such as a combination of
RNAseH and RNAseA, either before or after immobilization.

The nucleic acid sequences useful in the methods of this invention are
generally amplified before being used to display and/or express the
peptides, polypeptides or proteins that they encode. Prior to
amplification, the single-stranded DNAs may be cleaved using either of
the methods described before. Alternatively, the single-stranded DNAs
may be amplified and then cleaved using one of those methods.

Any of the well known methods for amplifying nucleic acid sequences
may be used for such amplification. Methods that maximize, and do not
bias, diversity are preferred. In a preferred embodiment of this
invention where the nucleic acid sequences are derived from antibody
genes, the present invention preferably utilizes primers in the
constant regions of the heavy and light chain genes and primers to a
synthetic sequence that is attached at the 5' end of the sense strand.
Priming at such synthetic sequence avoids the use of sequences within
the variable regions of the antibody genes. Those variable region
priming sites generate bias against V genes that are either of rare
subclasses or that have been mutated at the priming sites. This bias
is partly due to suppression of diversity within the primer region and
partly due to lack of priming when many mutations are present in the
region complementary to the primer. The methods disclosed in this
invention have the advantage of not biasing the population of
amplified antibody genes for particular V gene types.

The synthetic sequences may be attached to the 5' end of the DNA
strand by various methods well known for ligating DNA sequences
together. RT CapExtention is one preferred method.

In RT CapExtention (derived from Smart PCR(TM)), a short overlap
(5'-...GGG-3' in the upper-strand primer (USP-GGG) complements
3'-CCC-5' in the lower strand) and reverse transcriptases are used so
that the reverse complement of the upper-strand primer is attached to
the lower strand.

FIGs. 1 and 2 show schematics of the amplification of VH and VL genes
using RT CapExtention. FIG. 1 shows a schematic of the amplification
of VH genes. FIG. 1, Panel A shows a primer specific to the poly-dT
region of the 3' UTR priming synthesis of the first, lower strand.
Primers that bind in the constant region are also suitable. Panel B
shows the lower strand extended at its 3' end by three Cs that are not
complementary to the mRNA. Panel C shows the result of annealing a
synthetic top-strand primer ending in three Gs that hybridize to the
3' terminal Cs and continuing the reverse transcription, which extends
the lower strand by the reverse complement of the synthetic primer
sequence. Panel D shows the result of PCR amplification using a 5'
biotinylated synthetic top-strand primer that replicates the 5' end of
the synthetic primer of Panel C and a bottom-strand primer
complementary to part of the constant domain. Panel E shows
immobilized double-stranded (ds) cDNA obtained by using a
5'-biotinylated top-strand primer.

FIG. 2 shows a similar schematic for amplification of VL genes. FIG.
2, Panel A shows a primer specific to the constant region at or near
the 3' end priming synthesis of the first, lower strand. Primers that
bind in the poly-dT region are also suitable. Panel B shows the lower
strand extended at its 3' end by three Cs that are not complementary
to the mRNA. Panel C shows the result of annealing a synthetic
top-strand primer ending in three Gs that hybridize to the 3' terminal
Cs and continuing the reverse transcription, which extends the lower
strand by the reverse complement of the synthetic primer sequence.
Panel D shows the result of PCR amplification using a 5' biotinylated
synthetic top-strand primer that replicates the 5' end of the
synthetic primer of Panel C and a bottom-strand primer complementary
to part of the constant domain. The bottom-strand primer also contains
a useful restriction endonuclease site, such as AscI. Panel E shows
immobilized ds cDNA obtained by using a 5'-biotinylated top-strand
primer.

In FIGs. 1 and 2, each V gene consists of a 5' untranslated region
(UTR) and a secretion signal, followed by the variable region,
followed by a constant region, followed by a 3' untranslated region
(which typically ends in poly-A). An initial primer for reverse
transcription may be complementary to the constant region or to the
poly-A segment of the 3'-UTR. For human heavy-chain genes, a primer of
15 T is preferred. Reverse transcriptases attach several C residues to
the 3' end of the newly synthesized DNA. RT CapExtention exploits this
feature. The reverse transcription reaction is first run with only a
lower-strand primer. After about 1 hour, a primer ending in GGG
(USP-GGG) and more RTase are added. This causes the lower-strand cDNA
to be extended by the reverse complement of the USP-GGG up to the
final GGG. Using one primer identical to part of the attached
synthetic sequence and a second primer complementary to a region of
known sequence at the 3' end of the sense strand, all the V genes are
amplified irrespective of their V gene subclass.
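
By way of illustration only, the sequence bookkeeping of RT
CapExtention described above can be sketched as follows. The sketch
assumes that reverse transcription reaches the 5' end of the mRNA,
that exactly three C residues are added, and that the mRNA is written
with DNA letters; the function names and the short sequences are
hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def rt_cap_extension(mrna_sense, usp):
    """Return the extended lower (antisense) strand, written 5' to 3'.

    mrna_sense -- mRNA sequence in DNA letters, 5' to 3' (assumed fully copied)
    usp        -- upper-strand primer sequence without its terminal GGG
    """
    lower = revcomp(mrna_sense)        # first-strand cDNA from the mRNA
    lower += "CCC"                     # non-templated Cs added at the 3' end
    usp_ggg = usp + "GGG"              # primer whose GGG pairs with the CCC
    template = revcomp(usp_ggg)        # "CCC" followed by revcomp(usp)
    lower += template[3:]              # extension past the paired CCC
    return lower


if __name__ == "__main__":
    lower = rt_cap_extension("ATGGCC", "ACGT")
    print(lower)
    print(revcomp(lower))  # sense strand now begins with the synthetic primer
```

The sense strand recovered from the extended lower strand begins with
the synthetic primer sequence, which is why a primer identical to part
of that sequence can be used for amplification irrespective of V gene
subclass.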

In another preferred embodiment, synthetic sequences may be added by
Rapid Amplification of cDNA Ends (RACE) (see Frohman, M.A., Dush,
M.K., & Martin, G.R. (1988) Proc. Natl. Acad. Sci. USA 85: 8998-9002).

FIG. 3 shows a schematic of RACE amplification of antibody heavy and
light chains. First, mRNA is selected by treating total or poly(A+)
RNA with calf intestinal phosphatase (CIP) to remove the 5'-phosphate
from all molecules that have one, such as ribosomal RNA, fragmented
mRNA, tRNA and genomic DNA. Full length mRNA (containing a protective
7-methyl cap structure) is unaffected. The RNA is then treated with
tobacco acid pyrophosphatase (TAP) to remove the cap structure from
full length mRNAs, leaving a 5'-monophosphate group. Next, a synthetic
RNA adaptor is ligated to the RNA population; only molecules that have
a 5'-phosphate (uncapped, full length mRNAs) will accept the adaptor.
Reverse transcriptase reactions using an oligo-dT primer, and nested
PCR (using one adaptor primer (located in the 5' synthetic adaptor)
and one primer for the gene), are then used to amplify the desired
transcript.
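
By way of illustration only, the selection logic of the CIP, TAP and
adaptor-ligation steps can be sketched by tracking the state of each
molecule's 5' end. The state labels, molecule names and function name
are hypothetical simplifications introduced for this sketch.

```python
def select_adaptor_ligated(molecules):
    """Return the names of molecules that accept the 5' RNA adaptor.

    molecules -- list of (name, five_prime_end) pairs, where
                 five_prime_end is "cap", "phosphate" or "hydroxyl"
    """
    # CIP removes exposed 5'-phosphates; capped ends are unaffected.
    after_cip = [
        (name, "hydroxyl" if end == "phosphate" else end)
        for name, end in molecules
    ]
    # TAP removes the cap, leaving a 5'-monophosphate.
    after_tap = [
        (name, "phosphate" if end == "cap" else end)
        for name, end in after_cip
    ]
    # Only molecules bearing a 5'-phosphate are ligated to the adaptor.
    return [name for name, end in after_tap if end == "phosphate"]


if __name__ == "__main__":
    pool = [
        ("full_length_mRNA", "cap"),
        ("fragmented_mRNA", "phosphate"),
        ("rRNA", "phosphate"),
    ]
    print(select_adaptor_ligated(pool))
```

The order of the two enzyme treatments matters in this sketch: were
the cap removed first, the phosphatase step would also remove the
5'-phosphate of the full length mRNAs.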

In a preferred embodiment of this invention, the upper strand or lower
strand primer may also be biotinylated or labeled at the 5' end with
one of a) free amino group, b) thiol, c) carboxylic acid and d)
another group not found in DNA that can react to form a strong bond to
a known partner on an insoluble medium. These can then be used to
immobilize the labeled strand after amplification. The immobilized DNA
can be either single or double-stranded.

After amplification (using, e.g., RT CapExtension or RACE), the DNAs
of this invention are rendered single-stranded. For example, the
strands can be separated by using a biotinylated primer, capturing the
biotinylated product on streptavidin beads, denaturing the DNA, and
washing away the complementary strand. Depending on which end of the
captured DNA is wanted, one will choose to immobilize either the upper
(sense) strand or the lower (antisense) strand.

To prepare the single-stranded amplified DNAs for cloning into genetic
packages so as to effect display of, or for expression of, the
peptides, polypeptides or proteins encoded, at least in part, by those
DNAs, they must be manipulated to provide ends suitable for cloning
and display and/or expression. In particular, any 5' untranslated
regions and mammalian signal sequences must be removed and replaced,
in frame, by a suitable signal sequence that functions in the display
or expression host. Additionally, parts of the variable domains (in
antibody genes) may be removed and replaced by synthetic segments
containing synthetic diversity. The diversity of other gene families
may likewise be expanded with synthetic diversity.

According to the methods of this invention, there are two ways to
manipulate the single-stranded DNAs for display and/or expression. The
first method comprises the steps of:

    (i) contacting the nucleic acid with a single-stranded
    oligonucleotide, the oligonucleotide being functionally
    complementary to the nucleic acid in the region in which cleavage
    is desired and including a sequence that with its complement in
    the nucleic acid forms a restriction endonuclease recognition site
    that on restriction results in cleavage of the nucleic acid at the
    desired location; and

    (ii) cleaving the nucleic acid solely at the recognition site
    formed by the complementation of the nucleic acid and the
    oligonucleotide;

the contacting and the cleaving steps being performed at a temperature
sufficient to maintain the nucleic acid in substantially
single-stranded form, the oligonucleotide being functionally
complementary to the nucleic acid over a large enough region to allow
the two strands to associate such that cleavage may occur at the
chosen temperature and at the desired location, and the cleavage being
carried out using a restriction endonuclease that is active at the
chosen temperature.

In this first method, short oligonucleotides
are annealed to the single-stranded DNA so that
restriction endonuclease recognition sites formed
within the now locally double-stranded regions of the
DNA can be cleaved. In particular, a recognition site
is chosen that is identical, and occurs at the same
position, in a substantial fraction of the
single-stranded DNAs.

For antibody genes, this can be done using a
catalog of germline sequences. See, e.g.,
"http://www.mrc-cpe.cam.ac.uk/imt-doc/restricted/ok.html."
Updates can be obtained from this site under the
heading "Amino acid and nucleotide sequence
alignments." For other families, similar comparisons
exist and may be used to select appropriate regions for
cleavage and to maintain diversity.

For example, Table 1 depicts the DNA
sequences of the FR3 regions of the 51 known human VH
germline genes. In this region, the genes contain
restriction endonuclease recognition sites shown in
Table 2. Restriction endonucleases that cleave a large
fraction of germline genes at the same site are
preferred over endonucleases that cut at a variety of
sites. Furthermore, it is preferred that there be only
one site for the restriction endonucleases within the
region to which the short oligonucleotide binds on the
single-stranded DNA, e.g., about 10 bases on either
side of the restriction endonuclease recognition site.

An enzyme that cleaves downstream in FR3 is
also more preferable because it captures fewer
mutations in the framework. This may be advantageous
in some cases. However, it is well known that
framework mutations exist and confer and enhance
antibody binding. The present invention, by choice of
appropriate restriction site, allows all or part of FR3
diversity to be captured. Hence, the method also
allows extensive diversity to be captured.

Finally, in the methods of this invention
restriction endonucleases that are active between about
37°C and about 75°C are used. Preferably, restriction
endonucleases that are active between about 45°C and
about 75°C may be used. More preferably, enzymes that
are active above 50°C, and most preferably active about
55°C, are used. Such temperatures maintain the nucleic
acid sequence to be cleaved in substantially
single-stranded form.

Enzymes shown in Table 2 that cut many of the
heavy chain FR3 germline genes at a single position
include: MaeIII(24@4), Tsp45I(21@4), HphI(44@5),
BsaJI(23@65), AluI(23@47), BlpI(21@48), DdeI(29@58),
BglII(10@61), MslI(44@72), BsiEI(23@74), EaeI(23@74),
EagI(23@74), HaeIII(25@75), Bst4CI(51@86),
HpyCH4III(51@86), HinfI(38@2), MlyI(18@2), PleI(18@2),
MnlI(31@67), HpyCH4V(21@44), BsmAI(16@11), BpmI(19@12),
XmnI(12@30), and SacI(11@51). (The notation used
means, for example, that BsmAI cuts 16 of the FR3
germline genes with a restriction endonuclease
recognition site beginning at base 11 of FR3.)
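
The "count@position" notation above can be reproduced for any set
of aligned germline sequences. The following is a minimal sketch,
assuming 1-based positions within the region and the standard IUPAC
ambiguity codes; the function names and the toy sequences are
hypothetical and are not taken from Table 1 or Table 2.

```python
import re
from collections import Counter

# Standard IUPAC nucleotide codes needed for the sites discussed here.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "R": "[AG]", "Y": "[CT]", "W": "[AT]",
    "S": "[CG]", "M": "[AC]", "K": "[GT]",
}

def site_regex(site):
    """Compile a recognition site (IUPAC codes) into an overlapping-match regex."""
    body = "".join(IUPAC[base] for base in site.upper())
    return re.compile("(?=" + body + ")")

def tally_sites(sequences, site):
    """Map each 1-based start position to the number of sequences with the site there."""
    pattern = site_regex(site)
    counts = Counter()
    for seq in sequences:
        for match in pattern.finditer(seq.upper()):
            counts[match.start() + 1] += 1
    return counts

def format_notation(counts):
    """Render the tally as 'count@position' strings, ordered by position."""
    return [f"{n}@{pos}" for pos, n in sorted(counts.items())]

# Toy sequences for illustration only (not real germline genes).
toy_region = ["AAACTGTAAA", "CCACAGTCCC", "GGGGGGGGGG"]
print(format_notation(tally_sites(toy_region, "ACNGT")))  # ['2@3']
```

An enzyme whose tally is concentrated at a single position, as in
the toy output, corresponds to the preferred case described above.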

For cleavage of human heavy chains in FR3,
the preferred restriction endonucleases are: Bst4CI (or
TaaI or HpyCH4III), BlpI, HpyCH4V, and MslI. Because
ACNGT (the restriction endonuclease recognition site
for Bst4CI, TaaI, and HpyCH4III) is found at a
consistent site in all the human FR3 germline genes,
one of those enzymes is the most preferred for capture
of heavy chain CDR3 diversity. BlpI and HpyCH4V are
complementary. BlpI cuts most members of the VH1 and
VH4 families while HpyCH4V cuts most members of the
VH3, VH5, VH6, and VH7 families. Neither enzyme cuts
VH2s, but this is a very small family, containing only
three members. Thus, these enzymes may also be used in
preferred embodiments of the methods of this invention.

The restriction endonucleases HpyCH4III,
Bst4CI, and TaaI all recognize 5'-ACnGT-3' and cut
upper strand DNA after n and lower strand DNA before
the base complementary to n. This is the most
preferred restriction endonuclease recognition site for
this method on human heavy chains because it is found
in all germline genes. Furthermore, the restriction
endonuclease recognition region (ACnGT) matches the
second and third bases of a tyrosine codon (tay) and
the following cysteine codon (tgy) as shown in Table 3.
These codons are highly conserved, especially the
cysteine in mature antibody genes.
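
The overlap between ACnGT and the tyrosine-cysteine codon pair can
be checked by enumeration. The sketch below is illustrative only:
it assumes the standard genetic code (Tyr = TAT/TAC, Cys = TGT/TGC)
and uses a hypothetical helper name.

```python
import re
from itertools import product

TYR_CODONS = ("TAT", "TAC")  # tay
CYS_CODONS = ("TGT", "TGC")  # tgy

def site_offsets(tyr, cys):
    """Return 0-based offsets of 5'-ACnGT-3' within the 6-base Tyr-Cys segment."""
    segment = (tyr + cys).upper()
    return [m.start() for m in re.finditer(r"(?=AC[ACGT]GT)", segment)]

for tyr, cys in product(TYR_CODONS, CYS_CODONS):
    # An offset of 1 means the site starts at the second base of the Tyr codon.
    print(tyr, cys, site_offsets(tyr, cys))
```

Under these assumptions only the TAC-TGT pair contains the site,
starting at the second base of the tyrosine codon, which agrees
with the placement described above.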

Table 4E shows the distinct oligonucleotides
of length 22 bases (except the last one, which is of
length 20). Table 5C shows the analysis of 1617 actual
heavy chain antibody genes. Of these, 1511 have the
site and match one of the candidate oligonucleotides to
within 4 mismatches. Eight oligonucleotides account
for most of the matches and are given in Table 4F.1.
The 8 oligonucleotides are very similar so that it is
likely that satisfactory cleavage will be achieved with
only one oligonucleotide (such as H43.77.97.1-02#1) by
adjusting temperature, pH, salinity, and the like. One
or two oligonucleotides may likewise suffice whenever
the germline gene sequences differ very little and
especially if they differ very little close to the
restriction endonuclease recognition region to be
cleaved. Table 5D shows a repeat analysis of 1617
actual heavy chain antibody genes using only the 8
chosen oligonucleotides. This shows that 1463 of the
sequences match at least one of the oligonucleotides to
within 4 mismatches and have the site as expected.
Only 7 sequences have a second HpyCH4III restriction
endonuclease recognition region in this region.
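
An analysis of this kind, in which each gene segment is scored
against a set of candidate oligonucleotides, can be sketched as
follows. This is an assumption-laden illustration rather than the
procedure actually used for Table 5: sequences are compared in the
same orientation and at equal length, and the names and toy
sequences are hypothetical.

```python
def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))

def best_match(gene_segment, candidates, max_mismatches=4):
    """Return (name, n_mismatches) for the closest candidate, or None if none is close enough."""
    scored = [
        (mismatches(gene_segment, seq), name)
        for name, seq in candidates.items()
        if len(seq) == len(gene_segment)
    ]
    if not scored:
        return None
    n, name = min(scored)
    return (name, n) if n <= max_mismatches else None

# Toy candidates for illustration only (not the oligonucleotides of Table 4).
candidates = {"oligoA": "ACGTACGTAC", "oligoB": "TTTTTTTTTT"}
print(best_match("ACGTACGTAA", candidates))  # ('oligoA', 1)
print(best_match("GGGGGGGGGG", candidates))  # None
```

Counting the genes for which `best_match` returns a result gives a
figure analogous to the 1511 and 1463 totals reported above.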

Another illustration of choosing an
appropriate restriction endonuclease recognition site
involves cleavage in FR1 of human heavy chains.
Cleavage in FR1 allows capture of the entire CDR
diversity of the heavy chain.

The germline genes for human heavy chain FR1
are shown in Table 6. Table 7 shows the restriction
endonuclease recognition sites found in human germline
gene FR1s. The preferred sites are BsgI(GTGCAG;39@4),
BsoFI(GCngc;43@6,11@9,2@3,1@12),
TseI(Gcwgc;43@6,11@9,2@3,1@12),
MspA1I(CMGckg;46@7,2@1), PvuII(CAGctg;46@7,2@1),
AluI(AGct;48@8,2@2), DdeI(Ctnag;22@52,9@48),
HphI(tcacc;22@80), BssKI(Nccngg;35@39,2@40),
BsaJI(Ccnngg;32@40,2@41), BstNI(CCwgg;33@40),
ScrFI(CCngg;35@40,2@41), EcoO109I(RGgnccy;22@46,
11@43), Sau96I(Ggncc;23@47,11@44),
AvaII(Ggwcc;23@47,4@44), PpuMI(RGgwccy;22@46,4@43),
BsmFI(gtccc;20@48), HinfI(Gantc;34@16,21@56,21@77),
TfiI(21@77), MlyI(GAGTC;34@16), MlyI(gactc;21@56), and
AlwNI(CAGnnnctg;22@68). The more preferred sites are
MspA1I and PvuII. MspA1I and PvuII have 46 sites at 7-12
and 2 at 1-6. To avoid cleavage at both sites,
oligonucleotides are used that do not fully cover the
site at 1-6. Thus, the DNA will not be cleaved at that
site. We have shown that DNA that extends 3, 4, or 5
bases beyond a PvuII-site can be cleaved efficiently.

Another illustration of choosing an
appropriate restriction endonuclease recognition site
involves cleavage in FR1 of human kappa light chains.
Table 8 shows the human kappa FR1 germline genes and
Table 9 shows restriction endonuclease recognition
sites that are found in a substantial number of human
kappa FR1 germline genes at consistent locations. Of
the restriction endonuclease recognition sites listed,
BsmAI and PflFI are the most preferred enzymes. BsmAI
sites are found at base 18 in 35 of 40 germline genes.
PflFI sites are found in 35 of 40 germline genes at
base 12.

Another example of choosing an appropriate
restriction endonuclease recognition site involves
cleavage in FR1 of the human lambda light chain. Table
10 shows the 31 known human lambda FR1 germline gene
sequences. Table 11 shows restriction endonuclease
recognition sites found in human lambda FR1 germline
genes. HinfI and DdeI are the most preferred
restriction endonucleases for cutting human lambda
chains in FR1.

After the appropriate site or sites for
cleavage are chosen, one or more short oligonucleotides
are prepared so as to functionally complement, alone or
in combination, the chosen recognition site. The
oligonucleotides also include sequences that flank the
recognition site in the majority of the amplified
genes. This flanking region allows the sequence to
anneal to the single-stranded DNA sufficiently to allow
cleavage by the restriction endonuclease specific for
the site chosen.

The actual length and sequence of the
oligonucleotide depend on the recognition site and the
conditions to be used for contacting and cleavage. The
length must be sufficient so that the oligonucleotide
is functionally complementary to the single-stranded
DNA over a large enough region to allow the two strands
to associate such that cleavage may occur at the chosen
temperature and at the desired location.

Typically, the oligonucleotides of this
preferred method of the invention are about 17 to about
30 nucleotides in length. Below about 17 bases,
annealing is too weak and above 30 bases there can be a
loss of specificity. A preferred length is 18 to 24
bases.

Oligonucleotides of this length need not be
identical complements of the germline genes. Rather, a
few mismatches may be tolerated. Preferably,
however, no more than 1-3 mismatches are allowed. Such
mismatches do not adversely affect annealing of the
oligonucleotide to the single-stranded DNA. Hence, the
two DNAs are said to be functionally complementary.
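
The length and mismatch limits stated above can be expressed as a
simple screen. The following sketch is one possible reading of
"functionally complementary", assuming an oligonucleotide of about
17 to 30 bases that pairs with a target region of equal length with
at most 3 mismatches; it ignores temperature and salt, which the
text says also matter, and all names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def is_functionally_complementary(oligo, target_region,
                                  max_mismatches=3, min_len=17, max_len=30):
    """Screen an oligonucleotide against a single-stranded target region.

    Both sequences are written 5' to 3'. The oligonucleotide passes if its
    length is within [min_len, max_len], it is as long as the target region,
    and it differs from the perfect complement at no more than
    max_mismatches positions.
    """
    if not (min_len <= len(oligo) <= max_len):
        return False
    if len(oligo) != len(target_region):
        return False
    perfect = reverse_complement(target_region)
    n = sum(a != b for a, b in zip(oligo.upper(), perfect))
    return n <= max_mismatches

# Toy target for illustration only.
target = "AAACCCGGGTTTACGTAGCA"
oligo = reverse_complement(target)
print(is_functionally_complementary(oligo, target))  # True
```
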

The second method to manipulate the
single-stranded DNAs of this invention for display and/or
expression comprises the steps of:

(i) contacting the nucleic acid with a
partially double-stranded oligonucleotide,
the single-stranded region of the
oligonucleotide being functionally
complementary to the nucleic acid in the
region in which cleavage is desired, and the
double-stranded region of the oligonucleotide
having a restriction endonuclease recognition
site; and

(ii) cleaving the nucleic acid solely at
the cleavage site formed by the
complementation of the nucleic acid and the
single-stranded region of the
oligonucleotide;

the contacting and the cleaving steps being performed
at a temperature sufficient to maintain the nucleic
acid in substantially single-stranded form, the
oligonucleotide being functionally complementary to the
nucleic acid over a large enough region to allow the
two strands to associate such that cleavage may occur
at the chosen temperature and at the desired location,
and the cleavage being carried out using a restriction
endonuclease that is active at the chosen temperature.

As explained above, the cleavage site may be
formed by the single-stranded portion of the partially
double-stranded oligonucleotide duplexing with the
single-stranded DNA, the cleavage site may be carried
in the double-stranded portion of the partially
double-stranded oligonucleotide, or the cleavage site may be
introduced by the amplification primer used to amplify
the single-stranded DNA-partially double-stranded
oligonucleotide combination. In this embodiment, the
first is preferred. And, the restriction endonuclease
recognition site may be located in either the
double-stranded portion of the oligonucleotide or introduced
by the amplification primer, which is complementary to
that double-stranded region, as used to amplify the
combination.

Preferably, the restriction endonuclease site
is that of a Type II-S restriction endonuclease, whose
cleavage site is located at a known distance from its
recognition site.
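
Because a Type II-S enzyme cuts at a fixed distance from its
recognition site, the cut positions follow directly from the site
location. The sketch below uses FokI as the example, assuming its
commonly documented specification GGATG(9/13), i.e., cleavage 9
bases past the site on the strand shown and 13 bases past it on the
complementary strand. Only sites in the forward orientation are
considered, and the function name and test sequence are
hypothetical.

```python
def type_iis_cuts(top_strand, site="GGATG", top_offset=9, bottom_offset=13):
    """Locate Type II-S cuts for forward-oriented sites on the given strand.

    Returns a list of (top_cut, bottom_cut) pairs. Each value is a 0-based
    index into top_strand marking the first base after the cut; bottom_cut
    is expressed in the same top-strand coordinates.
    """
    seq = top_strand.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        end = start + len(site)
        top_cut = end + top_offset
        bottom_cut = end + bottom_offset
        if bottom_cut <= len(seq):  # skip sites too close to the end
            cuts.append((top_cut, bottom_cut))
        start = seq.find(site, start + 1)
    return cuts

# Toy sequence for illustration only.
demo = "AA" + "GGATG" + "ACGTACGTACGTACGT"
print(type_iis_cuts(demo))  # [(16, 20)]
```

The four-base difference between the two cut positions corresponds
to the staggered, asymmetrical cleavage discussed later for the
preferred Type II-S enzymes.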

This second method, preferably, employs
Universal Restriction Endonucleases ("URE"). UREs are
partially double-stranded oligonucleotides. The
single-stranded portion or overlap of the URE consists
of a DNA adapter that is functionally complementary to
the sequence to be cleaved in the single-stranded DNA.
The double-stranded portion consists of a restriction
endonuclease recognition site, preferably type II-S.

The URE method of this invention is specific
and precise and can tolerate some (e.g., 1-3)
mismatches in the complementary regions, i.e., the
adapter need only be functionally complementary to the
region to be cleaved. Further,
conditions under which the URE is used can be adjusted
so that most of the genes that are amplified can be
cut, reducing bias in the library produced from those
genes.

The sequence of the single-stranded DNA
adapter or overlap portion of the URE typically
consists of about 14-22 bases. However, longer or
shorter adapters may be used. The size depends on the
ability of the adapter to associate with its functional
complement in the single-stranded DNA and on the
temperature used for contacting the URE with the
single-stranded DNA and for cleaving the
DNA with the restriction enzyme. The adapter must be
functionally complementary to the single-stranded DNA
over a large enough region to allow the two strands to
associate such that the cleavage may occur at the
chosen temperature and at the desired location. We
prefer single-stranded or overlap portions of 14-17
bases in length, and more preferably 18-20 bases in
length.

The site chosen for cleavage using the URE is
preferably one that is substantially conserved in the
family of amplified DNAs. As compared to the first
cleavage method of this invention, these sites do not
need to be endonuclease recognition sites. However,
like the first method, the sites chosen can be
synthetic rather than existing in the native DNA. Such
sites may be chosen by reference to the sequences of
known antibodies or other families of genes. For
example, the sequences of many germline genes are
reported at
http://www.mrc-cpe.cam.ac.uk/imt-doc/restricted/ok.html.
For example, one preferred
site occurs near the end of FR3, from codon 89 through
the second base of codon 93. CDR3 begins at codon 95.

The sequences of 79 human heavy-chain genes
are also available at
http://www.ncbi.nlm.nih.gov/entrez/nucleotide.html.
This site can be used to identify appropriate sequences
for URE cleavage according to the methods of this
invention. See, e.g., Table 12B.

Most preferably, one or more sequences are
identified using these sites or other available
sequence information. These sequences together are
present in a substantial fraction of the amplified
DNAs. For example, multiple sequences could be used to
allow for known diversity in germline genes or for
frequent somatic mutations. Synthetic degenerate
sequences could also be used. Preferably, a
sequence(s) that occurs in at least 65% of genes
examined with no more than 2-3 mismatches is chosen.

URE single-stranded adapters or overlaps are
then made to be complementary to the chosen regions.
Conditions for using the UREs are determined
empirically. These conditions should allow cleavage of
DNA that contains the functionally complementary
sequences with no more than 2 or 3 mismatches but
should not allow cleavage of DNA lacking such sequences.

As described above, the double-stranded
portion of the URE includes an endonuclease recognition
site, preferably a Type II-S recognition site. Any
enzyme that is active at a temperature necessary to
maintain the single-stranded DNA substantially in that
form and to allow the single-stranded DNA adapter
portion of the URE to anneal long enough to the
single-stranded DNA to permit cleavage at the desired site may
be used.

The preferred Type II-S enzymes for use in
the URE methods of this invention provide asymmetrical
cleavage of the single-stranded DNA. Among these are
the enzymes listed in Table 13. The most preferred
Type II-S enzyme is FokI.

When the preferred FokI-containing URE is
used, several conditions are preferably used to effect
cleavage:

1) Excess of the URE over target DNA should be
present to activate the enzyme. URE present
only in equimolar amounts to the target DNA
would yield poor cleavage of ssDNA because
the amount of active enzyme available would
be limiting.

2) An activator may be used to activate part of
the FokI enzyme to dimerize without causing
cleavage. Examples of appropriate activators
are shown in Table 14.

3) The cleavage reaction is performed at a
temperature between 45°C and 75°C, preferably
above 50°C and most preferably above 55°C.

The UREs used in the prior art contained a
14-base single-stranded segment, a 10-base stem
(containing a FokI site), followed by the palindrome of
the 10-base stem. While such UREs may be used in the
methods of this invention, the preferred UREs of this
invention also include a segment of three to eight
bases (a loop) between the FokI restriction
endonuclease recognition site containing segments. In
the preferred embodiment, the stem (containing the FokI
site) and its palindrome are also longer than 10 bases.
Preferably, they are 10-14 bases in length. Examples
of these "lollipop" URE adapters are shown in Table 15.

One example of using a URE to cleave a
single-stranded DNA involves the FR3 region of human
heavy chain. Table 16 shows an analysis of 840
full-length mature human heavy chains with the URE
recognition sequences shown. The vast majority
(718/840=0.85) will be recognized with 2 or fewer
mismatches using five UREs (VHS881-1.1, VHS881-1.2,
VHS881-2.1, VHS881-4.1, and VHS881-9.1). Each has a
20-base adaptor sequence to complement the germline
gene, a ten-base stem segment containing a FokI site, a
five base loop, and the reverse complement of the first
stem segment. Annealing those adapters, alone or in
combination, to single-stranded antisense heavy chain
DNA and treating with FokI in the presence of, e.g.,
the activator FOKIact, will lead to cleavage of the
antisense strand at the position indicated.
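
The layout of such a "lollipop" URE (adaptor, stem, loop, reverse
complement of the stem) can be assembled programmatically. The
sketch below is illustrative only: the adaptor and stem sequences
are hypothetical placeholders rather than the sequences of Table 15
or Table 16, the loop defaults to the five-base TTGTT, and the
orientation of the FokI site relative to the adaptor, which
determines where cleavage falls, is not modeled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def build_lollipop_ure(adapter, stem, loop="TTGTT"):
    """Concatenate adaptor + stem + loop + reverse complement of the stem.

    The stem must contain a FokI recognition site (GGATG or its
    complement CATCC) so that the folded hairpin carries the site.
    """
    stem = stem.upper()
    if "GGATG" not in stem and "CATCC" not in stem:
        raise ValueError("stem does not contain a FokI recognition site")
    if not 3 <= len(loop) <= 8:
        raise ValueError("loop should be three to eight bases")
    return adapter.upper() + stem + loop.upper() + reverse_complement(stem)

# Hypothetical 20-base adaptor and 10-base stem, for illustration only.
adapter = "ACGTTGCAACGTTGCAACGT"
stem = "CACGGATGTG"
ure = build_lollipop_ure(adapter, stem)
print(len(ure))  # 45
```

The loop interrupts the otherwise palindromic stem, so that folding
of a single molecule into a hairpin is favored over pairing of two
molecules.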

Another example of using a URE(s) to cleave a
single-stranded DNA involves the FR1 region of the
human kappa light chains. Table 17 shows an analysis
of 182 full-length human kappa chains for matching by
the four 19-base probe sequences shown. Ninety-six
percent of the sequences match one of the probes with 2
or fewer mismatches. The URE adapters shown in Table
17 are for cleavage of the sense strand of kappa
chains. Thus, the adaptor sequences are the reverse
complement of the germline gene sequences. The URE
consists of a ten-base stem, a five base loop, the
reverse complement of the stem and the complementation
sequence. The loop shown here is TTGTT, but other
sequences could be used. Its function is to interrupt
the palindrome of the stems so that formation of a
lollipop monomer is favored over dimerization. Table
17 also shows where the sense strand is cleaved.

Another example of using a URE to cleave a
single-stranded DNA involves the human lambda light
chain. Table 18 shows analysis of 128 human lambda
light chains for matching the four 19-base probes
shown. With three or fewer mismatches, 88 of 128 (69%)
of the chains match one of the probes. Table 18 also
shows URE adapters corresponding to these probes.
Annealing these adapters to upper-strand ssDNA of
lambda chains and treatment with FokI in the presence
of FOKIact at a temperature at or above 45°C will lead
to specific and precise cleavage of the chains.

The conditions under which the short
oligonucleotide sequences of the first method and the
UREs of the second method are contacted with the
single-stranded DNAs may be empirically determined.
The conditions must be such that the single-stranded
DNA remains in substantially single-stranded form.
More particularly, the conditions must be such that the
single-stranded DNA does not form loops that may
interfere with its association with the oligonucleotide
sequence or the URE or that may themselves provide
sites for cleavage by the chosen restriction
endonuclease.

The effectiveness and specificity of short
oligonucleotides (first method) and UREs (second
method) can be adjusted by controlling the
concentrations of the URE adapters/oligonucleotides and
substrate DNA, the temperature, the pH, the
concentration of metal ions, the ionic strength, the
concentration of chaotropes (such as urea and
formamide), the concentration of the restriction
endonuclease (e.g., FokI), and the time of the
digestion. These conditions can be optimized with
synthetic oligonucleotides having: 1) target germline
gene sequences, 2) mutated target gene sequences, or 3)
somewhat related non-target sequences. The goal is to
cleave most of the target sequences and minimal amounts
of non-targets.

In accordance with this invention, the
single-stranded DNA is maintained in substantially that
form using a temperature between about 37°C and about
75°C. Preferably, a temperature between about 45°C and
about 75°C is used. More preferably, a temperature
between 50°C and 60°C, most preferably between 55°C and
60°C, is used. These temperatures are employed both
when contacting the DNA with the oligonucleotide or URE
and when cleaving the DNA using the methods of this
invention.

The two cleavage methods of this invention
have several advantages. The first method allows the
individual members of the family of single-stranded
DNAs to be cleaved preferentially at one substantially
conserved endonuclease recognition site. The method
also does not require an endonuclease recognition site
to be built into the reverse transcription or
amplification primers. Any native or synthetic site in
the family can be used.

The second method has both of these
advantages. In addition, the preferred URE method
allows the single-stranded DNAs to be cleaved at
positions where no endonuclease recognition site
naturally occurs or has been synthetically constructed.

Most importantly, both cleavage methods
permit the use of 5' and 3' primers so as to maximize
diversity and then cleavage to remove unwanted or
deleterious sequences before cloning, display and/or
expression.

After cleavage of the amplified DNAs using
one of the methods of this invention, the DNA is
prepared for cloning, display and/or expression. This
is done by using a partially duplexed synthetic DNA
adapter, whose terminal sequence is based on the
specific cleavage site at which the amplified DNA has
been cleaved.

The synthetic DNA is designed such that, when
it is ligated to the cleaved single-stranded DNA, the
proper reading frame is maintained so that the desired
peptide, polypeptide or protein can be displayed on the
surface of the genetic package and/or expressed. Preferably,
the double-stranded portion of the adapter comprises
the sequence of several codons that encode the amino
acid sequence characteristic of the family of peptides,
polypeptides or proteins up to the cleavage site. For
human heavy chains, the amino acids of the 3-23
framework are preferably used to provide the sequences
required for expression of the cleaved DNA.

Preferably, the double-stranded portion of
the adapter is about 12 to 100 bases in length. More
preferably, about 20 to 100 bases are used. The
double-stranded region of the adapter also preferably
contains at least one endonuclease recognition site
useful for cloning the DNA into a suitable display
and/or expression vector (or a recipient vector used to
archive the diversity). This restriction endonuclease
site may be native to the germline gene sequences used
to extend the DNA sequence. It may be also constructed
using degenerate sequences to the native germline gene
sequences. Or, it may be wholly synthetic.

The single-stranded portion of the adapter is
complementary to the region of the cleavage in the
single-stranded DNA. The overlap can be from about 2
bases up to about 15 bases. The longer the overlap,
the more efficient the ligation is likely to be. A
preferred length for the overlap is 7 to 10. This
allows some mismatches in the region so that diversity
in this region may be captured.

The single-stranded region or overlap of the
partially duplexed adapter is advantageous because it
allows DNA cleaved at the chosen site, but not other
fragments, to be captured. Such other fragments would
contaminate the library with genes encoding sequences
that will not fold into proper antibodies and are
likely to be non-specifically sticky.

One illustration of the use of a partially
duplexed adaptor in the methods of this invention
involves ligating such adaptor to a human FR3 region
that has been cleaved, as described above, at
5'-ACnGT-3' using HpyCH4III, Bst4CI or TaaI.

Table 4F.2 shows the bottom strand of the
double-stranded portion of the adaptor for ligation to
the cleaved bottom-strand DNA. Since the
HpyCH4III-site is so far to the right (as shown in Table 3), a
sequence that includes the AflII-site as well as the
XbaI site can be added. This bottom strand portion of
the partially-duplexed adaptor, H43.XAExt,
incorporates both XbaI and AflII-sites. The top strand
of the double-stranded portion of the adaptor has
neither site (due to planned mismatches in the segments
opposite the XbaI and AflII-sites of H43.XAExt), but
will anneal very tightly to H43.XAExt. H43.AExt
contains only the AflII-site and is to be used with the
top strands H43.ABr1 and H43.ABr2 (which have
intentional alterations to destroy the AflII-site).

After ligation, the desired, captured DNA can
be PCR amplified again, if desired, using in the
preferred embodiment a primer to the downstream
constant region of the antibody gene and a primer to
part of the double-stranded region of the adapter. The
primers may also carry restriction endonuclease sites
for use in cloning the amplified DNA.

After ligation, and perhaps amplification, of
the partially double-stranded adapter to the
single-stranded amplified DNA, the composite DNA is cleaved at
chosen 5' and 3' endonuclease recognition sites.

The cleavage sites useful for cloning depend
on the phage or phagemid or other vectors into which
the cassette will be inserted and the available sites
in the antibody genes. Table 19 provides restriction
endonuclease data for 75 human light chains. Table 20
shows corresponding data for 79 human heavy chains. In
each Table, the endonucleases are ordered by increasing
frequency of cutting. In these Tables, Nch is the
number of chains cut by the enzyme and Ns is the number
of sites (some chains have more than one site).
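
The two statistics Nch and Ns can be computed for any enzyme from a
collection of chain sequences. The following is a minimal sketch
under stated assumptions: the recognition site is supplied as a
regular expression, overlapping occurrences are counted, and the
toy chains are hypothetical. BstEII (GGTNACC) is used as the
example site.

```python
import re

def count_nch_ns(chains, site_pattern):
    """Return (Nch, Ns) for one recognition site.

    Nch is the number of chains containing at least one site;
    Ns is the total number of sites over all chains.
    """
    regex = re.compile("(?=" + site_pattern + ")")
    per_chain = [len(regex.findall(seq.upper())) for seq in chains]
    nch = sum(1 for n in per_chain if n > 0)
    ns = sum(per_chain)
    return nch, ns

# Toy chains for illustration only (not data from Table 19 or Table 20).
toy_chains = ["AAGGTCACCAA", "GGTAACCTTGGTGACC", "ACGTACGT"]
print(count_nch_ns(toy_chains, "GGT[ACGT]ACC"))  # (2, 3)
```

Sorting enzymes by Nch then reproduces the ordering by increasing
frequency of cutting; enzymes with low Nch are the candidates for
cloning sites.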

From this analysis, SfiI, NotI, AflII, ApaLI,
and AscI are very suitable. SfiI and NotI are
preferably used in pCES1 to insert the heavy-chain
display segment. ApaLI and AscI are preferably used in
pCES1 to insert the light-chain display segment.

BstEII-sites occur in 97% of germ-line JH
genes. In rearranged V genes, only 54/79 (68%) of
heavy-chain genes contain a BstEII-site and 7/61 of
these contain two sites. Thus, 47/79 (59%) contain a
single BstEII-site. An alternative to using BstEII is
to cleave via UREs at the end of JH and ligate to a
synthetic oligonucleotide that encodes part of CH1.

One example of preparing a family of DNA
sequences using the methods of this invention involves
capturing human CDR3 diversity. As described above,
mRNAs from various autoimmune patients are reverse
transcribed into lower strand cDNA. After the top
strand RNA is degraded, the lower strand is immobilized
and a short oligonucleotide used to cleave the cDNA
upstream of CDR3. A partially duplexed synthetic DNA
adapter is then annealed to the DNA and the DNA is
amplified using a primer to the adapter and a primer to
the constant region (after FR4). The DNA is then
cleaved using BstEII (in FR4) and a restriction
endonuclease appropriate to the partially
double-stranded adapter (e.g., XbaI and AflII (in FR3)). The
DNA is then ligated into a synthetic VH skeleton such
as 3-23.

One example of preparing a single-stranded
DNA that was cleaved using the URE method involves the
human kappa chain. The cleavage site in the sense
strand of this chain is depicted in Table 17. The
oligonucleotide kapextURE is annealed to the
oligonucleotides (kaBR01UR, kaBR02UR, kaBR03UR, and
kaBR04UR) to form a partially duplex DNA. This DNA is
then ligated to the cleaved soluble kappa chains. The
ligation product is then amplified using primers
kapextUREPCR and CKForeAsc (which inserts an AscI site
after the end of C kappa). This product is then
cleaved with ApaLI and AscI and ligated to similarly
cut recipient vector.

Another example involves the cleavage of
lambda light chains, illustrated in Table 18. After
cleavage, an extender (ON_LamEx133) and four bridge
oligonucleotides (ON_LamB1-133, ON_LamB2-133, ON_LamB3-133,
and ON_LamB4-133) are annealed to form a partially duplex
DNA. That DNA is ligated to the cleaved lambda-chain
sense strands. After ligation, the DNA is amplified
with ON_Lam133PCR and a forward primer specific to the
lambda constant domain, such as CL2ForeAsc or
CL7ForeAsc (Table 130).

In human heavy chains, one can cleave almost
all genes in FR4 (downstream, i.e., toward the 3' end
of the sense strand, of CDR3) at a BstEII-site that
occurs at a constant position in a very large fraction
of human heavy-chain V genes. One then needs a site in
FR3, if only CDR3 diversity is to be captured, in FR2,
if CDR2 and CDR3 diversity is wanted, or in FR1, if all
the CDR diversity is wanted. These sites are
preferably inserted as part of the partially
double-stranded adaptor.

The preferred process of this invention is to
provide recipient vectors (e.g., for display and/or
expression) having sites that allow cloning of either
light or heavy chains. Such vectors are well known and
widely used in the art. A preferred phage display
vector in accordance with this invention is phage
MALIA3. This displays in gene III. The sequence of
the phage MALIA3 is shown in Table 21A (annotated) and
Table 21B (condensed).

The DNA encoding the selected regions of the
light or heavy chains can be transferred to the vectors
using endonucleases that cut either light or heavy
chains only very rarely. For example, light chains may
be captured with ApaLI and AscI. Heavy-chain genes are
preferably cloned into a recipient vector having SfiI,
NcoI, XbaI, AflII, BstEII, ApaI, and NotI sites. The
light chains are preferably moved into the library as
ApaLI-AscI fragments. The heavy chains are preferably
moved into the library as SfiI-NotI fragments.

Most preferably, the display is on the
surface of a derivative of M13 phage. The most
preferred vector contains all the genes of M13, an
antibiotic resistance gene, and the display cassette.
The preferred vector is provided with restriction sites
that allow introduction and excision of members of the
diverse family of genes, as cassettes. The preferred
vector is stable against rearrangement under the growth
conditions used to amplify phage.

In another embodiment of this invention, the
diversity captured by the methods of the present
invention may be displayed and/or expressed in a
phagemid vector (e.g., pCES1) that displays and/or
expresses the peptide, polypeptide or protein. Such
vectors may also be used to store the diversity for
subsequent display and/or expression using other
vectors or phage.

In another embodiment of this invention, the
diversity captured by the methods of the present
invention may be displayed and/or expressed in a yeast
vector.

In another embodiment, the mode of display
may be through a short linker to anchor domains, one
possible anchor comprising the final portion of M13 III
("IIIstump") and a second possible anchor being the
full length III mature protein.

The IIIstump fragment contains enough of M13
III to assemble into phage but not the domains involved
in mediating infectivity. Because the w.t. III
proteins are present, the phage is unlikely to delete
the antibody genes, and phage that do delete these
segments receive only a very small growth advantage.
For each of the anchor domains, the DNA encodes the
w.t. AA sequence, but differs from the w.t. DNA
sequence to a very high extent. This will greatly
reduce the potential for homologous recombination
between the anchor and the w.t. gene that is also
present (see Example 6).

Most preferably, the present invention uses a
complete phage carrying an antibiotic-resistance gene
(such as an ampicillin-resistance gene) and the display
cassette. Because the w.t. iii and possibly viii genes
are present, the w.t. proteins are also present. The
display cassette is transcribed from a regulatable
promoter (e.g., PlacZ). Use of a regulatable promoter
allows control of the ratio of the fusion display gene
to the corresponding w.t. coat protein. This ratio
determines the average number of copies of the display
fusion per phage (or phagemid) particle.

Another aspect of the invention is a method
of displaying peptides, polypeptides or proteins (and
particularly Fabs) on filamentous phage. In the most
preferred embodiment this method displays Fabs and
comprises:

a) obtaining a cassette capturing a diversity of
segments of DNA encoding the elements:

Preg::RBS1::SS1::VL::CL::stop::RBS2::SS2::VH::CH1::
linker::anchor::stop::,




where Preg is a regulatable promoter, RBS1 is a first
ribosome binding site, SS1 is a signal sequence
operable in the host strain, VL is a member of a
diverse set of light-chain variable regions, CL is a
light-chain constant region, stop is one or more stop
codons, RBS2 is a second ribosome binding site, SS2 is
a second signal sequence operable in the host strain,
VH is a member of a diverse set of heavy-chain variable
regions, CH1 is an antibody heavy-chain first constant
domain, linker is a sequence of amino acids of one to
about 50 residues, anchor is a protein that will
assemble into the filamentous phage particle and stop
is a second example of one or more stop codons; and

b) positioning that cassette within the phage
genome to maximize the viability of the phage
and to minimize the potential for deletion of
the cassette or parts thereof.
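
By way of illustration only, the order of elements
recited in step (a) can be written down as a simple
data structure. The following Python sketch is not part
of the described method; the names CASSETTE_ELEMENTS,
cassette_map and linker_length_ok are hypothetical, and
the upper bound of "about 50" linker residues is
treated here as a hard limit of 50 purely for
convenience.

```python
# Element order of the display cassette as recited in step (a).
# All identifiers in this sketch are illustrative assumptions.
CASSETTE_ELEMENTS = (
    "Preg", "RBS1", "SS1", "VL", "CL", "stop",
    "RBS2", "SS2", "VH", "CH1", "linker", "anchor", "stop",
)


def cassette_map(elements=CASSETTE_ELEMENTS):
    """Return the cassette in the '::' notation used in the text."""
    return "::".join(elements)


def linker_length_ok(linker_aa, max_len=50):
    """Check that a linker has one to 'about 50' residues.

    The text says 'about 50'; the hard cap of 50 is an assumption.
    """
    return 1 <= len(linker_aa) <= max_len


if __name__ == "__main__":
    print(cassette_map())
    print(linker_length_ok("GGGGS"))
```

Such a representation merely records the order
VL::CL before VH::CH1::linker::anchor; it says nothing
about where in the phage genome the cassette is best
placed.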

The DNA encoding the anchor protein in the
above preferred cassette should be designed to encode
the same (or a closely related) amino acid sequence as
is found in one of the coat proteins of the phage, but
with a distinct DNA sequence. This is to prevent
unwanted homologous recombination with the w.t. gene.
In addition, the cassette should be placed in the
intergenic region. The positioning and orientation of
the display cassette can influence the behavior of the
phage.

In one embodiment of the invention, a
transcription terminator may be placed after the second
stop of the display cassette above (e.g., Trp). This
will reduce interaction between the display cassette




and other genes in the phage antibody display vector. 

In another embodiment of the methods of this
invention, the phage or phagemid can display and/or
express proteins other than Fab, by replacing the Fab
portions indicated above with other protein genes.

Various hosts can be used in the display and/or
expression aspect of this invention. Such hosts are
well known in the art. In the preferred embodiment,
where Fabs are being displayed and/or expressed, the
preferred host should grow at 30°C and be RecA- (to
reduce unwanted genetic recombination) and EndA- (to
make recovery of RF DNA easier). It is also preferred
that the host strain be easily transformed by
electroporation.

XL1-Blue MRF' satisfies most of these
preferences, but does not grow well at 30°C. XL1-Blue
MRF' does grow slowly at 38°C and thus is an acceptable
host. TG-1 is also an acceptable host although it is
RecA+ and EndA+. XL1-Blue MRF' is more preferred for
the intermediate host used to accumulate diversity
prior to final construction of the library.

After display and/or expression, the
libraries of this invention may be screened using well
known and conventionally used techniques. The selected
peptides, polypeptides or proteins may then be used to
treat disease. Generally, the peptides, polypeptides
or proteins for use in therapy or in pharmaceutical
compositions are produced by isolating the DNA encoding
the desired peptide, polypeptide or protein from the
member of the library selected. That DNA is then used
in conventional methods to produce the peptide,
polypeptide or protein it encodes in appropriate host
cells, preferably mammalian host cells, e.g., CHO
cells. After isolation, the peptide, polypeptide or
protein is used alone or with pharmaceutically
acceptable compositions in therapy to treat disease.

EXAMPLES 

Example 1: RACE amplification of heavy and light chain
antibody repertoires from autoimmune patients.

Total RNA was isolated from individual blood
samples (50 ml) of 11 patients using an RNAzol(TM) kit
(CINNA/Biotecx), as described by the manufacturer. The
patients were diagnosed as follows:

1. SLE and phospholipid syndrome
2. Limited systemic sclerosis
3. SLE and Sjogren syndrome
4. Limited systemic sclerosis
5. Rheumatoid arthritis with active vasculitis
6. Limited systemic sclerosis and Sjogren syndrome
7. Rheumatoid arthritis and (not active) vasculitis
8. SLE and Sjogren syndrome
9. SLE
10. SLE and (active) glomerulonephritis
11. Polyarthritis/Raynaud's phenomenon

From these 11 samples of total RNA, poly-A+ RNA was
isolated using the Promega PolyATtract® mRNA Isolation
kit (Promega).

250 ng of each poly-A+ RNA sample was used to
amplify antibody heavy and light chains with the
GeneRacer(TM) kit (Invitrogen cat no. L1500-01). A
schematic overview of the RACE procedure is shown in
FIG. 3.

Using the general protocol of the GeneRacer(TM)
kit, an RNA adaptor was ligated to the 5' end of all
mRNAs. Next, a reverse transcriptase reaction was
performed in the presence of oligo(dT15) specific
primer under conditions described by the manufacturer
in the GeneRacer(TM) kit.

One fifth of the cDNA from the reverse
transcriptase reaction was used in a 20 ul PCR
reaction. For amplification of the heavy chain IgM
repertoire, a forward primer based on the CH1 chain of
IgM [HuCmFOR] and a backward primer based on the
ligated synthetic adaptor sequence [5'A] were used.
(See Table 22.)

For amplification of the kappa and lambda
light chains, a forward primer that contains the 3'
coding-end of the cDNA [HuCkFor and HuCLFor2+HuCLFor7]
and a backward primer based on the ligated synthetic
adapter sequence [5'A] were used (see Table 22).
Specific amplification products after 30 cycles of
primary PCR were obtained.

FIG. 4 shows the amplification products
obtained after the primary PCR reaction from 4
different patient samples. 8 ul of primary PCR product
from each of 4 different patients was analyzed on an
agarose gel [labeled 1, 2, 3 and 4]. For the heavy
chain, a product of approximately 950 nt is obtained,
while for the kappa and lambda light chains the product
is approximately 850 nt. M1-2 are molecular weight
markers.

PCR products were also analyzed by DNA
sequencing [10 clones from the lambda, kappa or heavy
chain repertoires]. All sequenced antibody genes
recovered contained the full coding sequence as well as
the 5' leader sequence, and the V gene diversity was
the expected diversity (compared to literature data).

50 ng from each of the 11 individually
amplified samples were mixed for heavy, lambda light or
kappa light chains and used in secondary PCR reactions.

In all secondary PCRs approximately 1 ng of
template DNA from the primary PCR mixture was used in
multiple 50 ul PCR reactions [25 cycles].

For the heavy chain, a nested biotinylated
forward primer [HuCm-Nested] was used, and a nested
5' end backward primer located in the synthetic
adapter-sequence [5'NA] was used. The 5' end
lower-strand of the heavy chain was biotinylated.

For the light chains, a 5' end biotinylated
nested primer in the synthetic adapter was used [5'NA]
in combination with a 3' end primer in the constant
region of Ckappa and Clambda, extended with a sequence
coding for the AscI restriction site [kappa:
HuCkForAscI, lambda: HuCL2-FOR-ASC + HuCL7-FOR-ASC].
[The 5' end of the top strand DNA was biotinylated.]
After gel-analysis the secondary PCR products were
pooled and purified with Promega Wizard PCR cleanup.
Approximately 25 ug biotinylated heavy chain, lambda
and kappa light chain DNA was isolated from the 11
patients.

Example 2: Capturing kappa chains with BsmAI.

A repertoire of human kappa-chain mRNAs was
prepared using the RACE method of Example 1 from a
collection of patients having various autoimmune
diseases.

This Example followed the protocol of Example
1. Approximately 2 micrograms (ug) of human kappa-chain
(Igkappa) gene RACE material with biotin attached
to the 5'-end of the upper strand was immobilized as in
Example 1 on 200 microliters (uL) of Seradyn magnetic
beads. The lower strand was removed by washing the DNA
with 2 aliquots of 200 uL of 0.1 M NaOH (pH 13), for 3
minutes for the first aliquot followed by 30 seconds
for the second aliquot. The beads were neutralized
with 200 uL of 10 mM Tris (pH 7.5), 100 mM NaCl. The
short oligonucleotides shown in Table 23 were added in
40 fold molar excess in 100 uL of NEB buffer 2 (50 mM
NaCl, 10 mM Tris-HCl, 10 mM MgCl2, 1 mM dithiothreitol,
pH 7.9) to the dry beads. The mixture was incubated at
95°C for 5 minutes then cooled down to 55°C over 30
minutes. Excess oligonucleotide was washed away with 2
washes of NEB buffer 3 (100 mM NaCl, 50 mM Tris-HCl, 10
mM MgCl2, 1 mM dithiothreitol, pH 7.9). Ten units of
BsmAI (NEB) were added in NEB buffer 3 and incubated
for 1 h at 55°C. The cleaved downstream DNA was
collected and purified over a Qiagen PCR purification
column (FIGs. 5 and 6).

FIG. 5 shows an analysis of digested kappa
single-stranded DNA. Approximately 151.5 pmol of
adapter was annealed to 3.79 pmol of immobilized kappa
single-stranded DNA followed by digestion with 15 U of
BsmAI. The supernatant containing the desired DNA was
removed and analyzed by 5% polyacrylamide gel along
with the remaining beads which contained uncleaved full
length kappa DNA. 189 pmol of cleaved single-stranded
DNA was purified for further analysis. Five percent of
the original full length ssDNA remained on the beads.

FIG. 6 shows an analysis of the extender-cleaved
kappa ligation. 180 pmol of pre-annealed
bridge/extender was ligated to 1.8 pmol of BsmAI
digested single-stranded DNA. The ligated DNA was
purified by Qiagen PCR purification column and analyzed
on a 5% polyacrylamide gel. Results indicated that the
ligation of extender to single-stranded DNA was 95%
efficient.
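
The molar excesses quoted for FIGs. 5 and 6 can be
checked with simple arithmetic. The short Python
sketch below is given for illustration only; the
function name molar_excess is hypothetical, and the
input values are the pmol amounts stated above.

```python
def molar_excess(oligo_pmol, template_pmol):
    """Return the fold molar excess of oligonucleotide over template."""
    if template_pmol <= 0:
        raise ValueError("template amount must be positive")
    return oligo_pmol / template_pmol


# FIG. 5: 151.5 pmol adapter annealed to 3.79 pmol immobilized ssDNA.
adapter_excess = molar_excess(151.5, 3.79)

# FIG. 6: 180 pmol bridge/extender ligated to 1.8 pmol cleaved ssDNA.
bridge_excess = molar_excess(180, 1.8)

if __name__ == "__main__":
    print(f"adapter: about {adapter_excess:.0f} fold")
    print(f"bridge/extender: about {bridge_excess:.0f} fold")
```

The first ratio corresponds to the roughly 40 fold
excess used for annealing in this Example, and the
second to a 100 fold excess of bridge/extender in the
ligation.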

A partially double-stranded adaptor was
prepared using the oligonucleotides shown in Table 23.
The adaptor was added to the single-stranded DNA in 100
fold molar excess along with 1000 units of T4 DNA
ligase and incubated overnight at 16°C. The excess
oligonucleotide was removed with a Qiagen PCR
purification column. The ligated material was
amplified by PCR using the primers kapPCRt1 and kapfor
shown in Table 23 for 10 cycles with the program shown
in Table 24.

The soluble PCR product was run on a gel and
showed a band of approximately 700 nt, as expected
(FIGs. 7 and 8). The DNA was cleaved with enzymes
ApaLI and AscI, gel purified, and ligated to similarly
cleaved vector pCES1.

FIG. 7 shows an analysis of the PCR product
from the extender-kappa amplification. Ligated
extender-kappa single-stranded DNA was amplified with
primers specific to the extender and to the constant
region of the light chain. Two different template
amounts, 10 ng and 50 ng, were used, and 13 cycles
were used to generate approximately 1.5 ug of dsDNA as
shown by 0.8% agarose gel analysis.

FIG. 8 shows an analysis of the purified PCR
product from the extender-kappa amplification.
Approximately 5 ug of PCR amplified extender-kappa
double-stranded DNA was run out on a 0.8% agarose gel,
cut out, and extracted with a GFX gel purification
column. By gel analysis, 3.5 ug of double-stranded DNA
was prepared.

The assay for capturing kappa chains with
BsmAI was repeated and produced similar results.
FIG. 9A shows the DNA after it was cleaved and
collected and purified over a Qiagen PCR purification
column. FIG. 9B shows the partially double-stranded
adaptor ligated to the single-stranded DNA. This
ligated material was then amplified (FIG. 9C). The gel
showed a band of approximately 700 nt.

Table 25 shows the DNA sequence of a kappa
light chain captured by this procedure. Table 26 shows
a second sequence captured by this procedure. The
closest bridge sequence was complementary to the
sequence 5'-agccacc-3', but the sequence captured reads
5'-Tgccacc-3', showing that some mismatch in the
overlapped region is tolerated.
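
The extent of the mismatch between the two
heptamers can be made explicit with a position-wise
comparison. The Python sketch below is illustrative
only; the function name mismatches is hypothetical,
and only the two seven-base sequences quoted above are
taken from the text.

```python
def mismatches(expected, observed):
    """List (position, expected_base, observed_base) for each difference.

    Positions are 0-based from the 5' end; comparison ignores case.
    """
    if len(expected) != len(observed):
        raise ValueError("sequences must have equal length")
    expected, observed = expected.upper(), observed.upper()
    return [
        (i, e, o)
        for i, (e, o) in enumerate(zip(expected, observed))
        if e != o
    ]


# Sequence to which the closest bridge was complementary, and the
# sequence actually captured (both as quoted in the text).
bridge_target = "agccacc"
captured = "Tgccacc"
diffs = mismatches(bridge_target, captured)

if __name__ == "__main__":
    print(diffs)
```

Under this comparison the two sequences differ at a
single position, the 5'-most base of the heptamer.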

Example 3: Construction of Synthetic CDR1 and CDR2
Diversity in V-3-23 VH Framework.

Synthetic diversity in Complementarity
Determining Region (CDR) 1 and 2 was created in the
3-23 VH framework in a two step process: first, a
vector containing the 3-23 VH framework was
constructed; and then, a synthetic CDR 1 and 2 was
assembled and cloned into this vector.

For construction of the 3-23 VH framework, 8
oligonucleotides and two PCR primers (long
oligonucleotides - TOPFR1A, BOTFR1B, BOTFR2, BOTFR3,
F06, BOTFR4, ON-vgC1, and ON-vgC2 and primers - SFPRMET
and BOTPCRPRIM, shown in Table 27) that overlap were
designed based on the GenBank sequence of the 3-23 VH
framework region. The design incorporated at least one
useful restriction site in each framework region, as
shown in Table 27. In Table 27, the segments that were
synthesized are shown as bold, the overlapping regions
are underscored, and the PCR priming regions at each
end are underscored.

A mixture of these 8 oligos was combined at a
final concentration of 2.5 uM in a 20 ul PCR reaction.
The PCR mixture contained 200 uM dNTPs, 2.5 mM MgCl2,
0.02 U Pfu Turbo(TM) DNA Polymerase, 1 U Qiagen HotStart
Taq DNA Polymerase, and 1X Qiagen PCR buffer. The PCR
program consisted of 10 cycles of 94°C for 30s, 55°C
for 30s, and 72°C for 30s.

The assembled 3-23 VH DNA sequence was then
amplified, using 2.5 ul of a 10-fold dilution from the
initial PCR in a 100 ul PCR reaction. The PCR reaction
contained 200 uM dNTPs, 2.5 mM MgCl2, 0.02 U Pfu
Turbo(TM) DNA Polymerase, 1 U Qiagen HotStart Taq DNA
Polymerase, 1X Qiagen PCR Buffer and 2 outside primers
(SFPRMET and BOTPCRPRIM) at a concentration of 1 uM.
The PCR program consisted of 23 cycles at 94°C for 30s,
55°C for 30s, and 72°C for 60s. The 3-23 VH DNA
sequence was digested and cloned into pCES1 (phagemid
vector) using the SfiI and BstEII restriction
endonuclease sites.

All restriction enzymes mentioned herein were supplied
by New England BioLabs, Beverly, MA and used as per the
manufacturer's instructions.

Stuffer sequences (shown in Table 28 and
Table 29) were introduced into pCES1 to replace
CDR1/CDR2 sequences (900 bases between BspEI and XbaI
RE sites) and CDR3 sequences (358 bases between AflII
and BstEII) prior to cloning the CDR1/CDR2 diversity.
This new vector was termed pCES5 and its sequence is
given in Table 29.

Having stuffers in place of the CDRs avoids
the risk that a parental sequence would be
over-represented in the library. The stuffer sequences
are fragments from the penicillase gene of E. coli.
The CDR1-2 stuffer contains restriction sites for
BglII, Bsu36I, BclI, XcmI, MluI, PvuII, HpaI, and
HincII, the underscored sites being unique within the
vector pCES5. The stuffer that replaces CDR3 contains
the unique restriction endonuclease site RsrII.

A schematic representation of the design for
CDR1 and CDR2 synthetic diversity is shown in FIG. 10.
The design was based on the presence of mutations in
DP47/3-23 and related germline genes. Diversity was
designed to be introduced at the positions within CDR1
and CDR2 indicated by the numbers in FIG. 10. The
diversity at each position was chosen to be one of the
three following schemes: 1 = ADEFGHIKLMNPQRSTVWY; 2 =
YRWVGS; 3 = PS, in which letters encode equimolar mixes
of the indicated amino acids.
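
The theoretical number of amino-acid sequences
implied by such a design is the product of the number
of residues allowed at each variegated position. The
Python sketch below illustrates the calculation. The
three schemes are those listed above; the function
name library_size and the example layout of positions
are hypothetical, since the actual assignment of
schemes to positions is given only in FIG. 10.

```python
from math import prod

# Diversity schemes as listed in the text (one-letter amino-acid codes).
SCHEMES = {
    1: "ADEFGHIKLMNPQRSTVWY",  # all residues except Cys
    2: "YRWVGS",
    3: "PS",
}


def library_size(position_schemes):
    """Theoretical number of amino-acid sequences for a list of scheme ids.

    Each entry names the scheme (1, 2 or 3) used at one variegated position.
    """
    return prod(len(SCHEMES[s]) for s in position_schemes)


# Hypothetical layout for illustration only: two scheme-1 positions,
# one scheme-2 position and one scheme-3 position.
hypothetical_layout = [1, 1, 2, 3]

if __name__ == "__main__":
    print(len(SCHEMES[1]), len(SCHEMES[2]), len(SCHEMES[3]))
    print(library_size(hypothetical_layout))
```

Scheme 1 thus allows 19 residues per position,
scheme 2 allows six and scheme 3 allows two.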

For the construction of the CDR1 and CDR2
diversity, 4 overlapping oligonucleotides (ON-vgC1,
ON_Br12, ON_CD2Xba, and ON-vgC2, shown in Table 27 and
Table 30) encoding CDR1/2, plus flanking regions, were
designed. A mixture of these 4 oligos was combined at
a final concentration of 2.5 uM in a 40 ul PCR
reaction. Two of the 4 oligos contained variegated
sequences positioned at the CDR1 and the CDR2. The PCR
mixture contained 200 uM dNTPs, 2.5 U Pwo DNA
Polymerase (Roche), and 1X Pwo PCR buffer with 2 mM
MgSO4. The PCR program consisted of 10 cycles at 94°C
for 30s, 60°C for 30s, and 72°C for 60s. This
assembled CDR1/2 DNA sequence was amplified, using
2.5 ul of the mixture in a 100 ul PCR reaction. The PCR
reaction contained 200 uM dNTPs, 2.5 U Pwo DNA
Polymerase, 1X Pwo PCR Buffer with 2 mM MgSO4 and 2
outside primers at a concentration of 1 uM. The PCR
program consisted of 10 cycles at 94°C for 30s, 60°C
for 30s, and 72°C for 60s. These variegated sequences
were digested and cloned into the 3-23 VH framework in
place of the CDR1/2 stuffer.

We obtained approximately 7 x 10^7 independent
transformants. CDR3 diversity either from donor
populations or from synthetic DNA can be cloned into
the vector containing synthetic CDR1 and CDR2
diversity.

A schematic representation of this procedure
is shown in FIG. 11. A sequence encoding the FR
regions of the human V3-23 gene segment and CDR regions
with synthetic diversity was made by oligonucleotide
assembly and cloning via BspEI and XbaI sites into a
vector that complements the FR1 and FR3 regions. Into
this library of synthetic VH segments, the
complementary VH-CDR3 sequence (top right) was cloned
via XbaI and BstEII sites. The resulting cloned VH
genes contain a combination of designed synthetic
diversity and natural diversity (see FIG. 11).

Example 4: Cleavage and ligation of the lambda light
chains with HinfI.

A schematic of the cleavage and ligation of
antibody light chains is shown in FIGs. 12A and 12B.
Approximately 2 ug of biotinylated human lambda DNA
prepared as described in Example 1 was immobilized on
200 ul Seradyn magnetic beads. The lower strand was
removed by incubation of the DNA with 200 ul of 0.1 M
NaOH (pH=13) for 3 minutes; the supernatant was removed
and an additional washing of 30 seconds with 200 ul of
0.1 M NaOH was performed. Supernatant was removed and
the beads were neutralized with 200 ul of 10 mM Tris
(pH=7.5), 100 mM NaCl. Two additional washes with
200 ul NEB buffer 2, containing 10 mM Tris (pH=7.9),
50 mM NaCl, 10 mM MgCl2 and 1 mM dithiothreitol, were
performed. After immobilization, the amount of ssDNA
was estimated on a 5% PAGE-urea gel.

About 0.8 ug ssDNA was recovered and
incubated in 100 ul NEB buffer 2 containing 80 fold
molar excess of an equimolar mix of ON_Lam1aB7,
ON_Lam2aB7, ON_Lam31B7 and ON_Lam3rB7 [each oligo in
20 fold molar excess] (see Table 31).

The mixture was incubated at 95°C for 5
minutes and then slowly cooled down to 50°C over a
period of 30 minutes. Excess oligonucleotide was
washed away with 2 washes of 200 ul of NEB buffer 2.
4 U/ug of HinfI was added and incubated for 1 hour at
50°C. Beads were mixed every 10 minutes.

After incubation the sample was purified over
a Qiagen PCR purification column and was subsequently
analysed on a 5% PAGE-urea gel (see FIG. 13A; cleavage
was more than 70% efficient).

A schematic of the ligation of the cleaved
light chains is shown in FIG. 12B. A mix of
bridge/extender pairs was prepared from the Brg/Ext
oligos listed in Table 31 (total molar excess 100
fold), combined with the cleaved DNA and 1000 U of T4
DNA ligase (NEB), and incubated overnight at 16°C.
After ligation of the DNA, the excess oligonucleotide
was removed with a Qiagen PCR purification column and
ligation was checked on a urea-PAGE gel (see FIG. 13B;
ligation was more than 95% efficient).

Multiple PCRs were performed containing 10 ng
of the ligated material in a 50 ul PCR reaction using
25 pmol ON lamPlePCR and 25 pmol of an equimolar mix
of Hu-CL2AscI/HuCL7AscI primer (see Example 1).

PCR was performed at 60°C for 15 cycles
using Pfu polymerase. About 1 ug of dsDNA was recovered
per PCR (see FIG. 13C) and cleaved with ApaLI and AscI
for cloning the lambda light chains in pCES2.

Example 5: Capture of human heavy-chain CDR3
population.

A schematic of the cleavage and ligation of
antibody light chains is shown in FIGs. 14A and 14B.

Approximately 3 ug of human heavy-chain (IgM)
gene RACE material with biotin attached to the 5'-end
of the lower strand was immobilized on 300 uL of
Seradyn magnetic beads. The upper strand was removed
by washing the DNA with 2 aliquots of 300 uL of 0.1 M
NaOH (pH 13), for 3 minutes for the first aliquot
followed by 30 seconds for the second aliquot. The
beads were neutralized with 300 uL of 10 mM Tris
(pH 7.5), 100 mM NaCl. The REdaptors (oligonucleotides
used to make single-stranded DNA locally
double-stranded) shown in Table 32 were added in 30
fold molar excess in 200 uL of NEB buffer 4 (50 mM
potassium acetate, 20 mM Tris-acetate, 10 mM magnesium
acetate, 1 mM dithiothreitol, pH 7.9) to the dry beads.
The REdaptors were incubated with the single-stranded
DNA at 80°C for 5 minutes then cooled down to 55°C over
30 minutes. Excess REdaptors were washed away with 2
washes of NEB buffer 4. Fifteen units of HpyCH4III
(NEB) were added in NEB buffer 4 and incubated for 1
hour at 55°C. The cleaved downstream DNA remaining on
the beads was removed from the beads using a Qiagen
Nucleotide removal column (see FIG. 15).

The Bridge/Extender pairs shown in Table 33
were added in 25 fold molar excess along with 1200
units of T4 DNA ligase and incubated overnight at 16°C.
Excess Bridge/Extender was removed with a Qiagen PCR
purification column. The ligated material was
amplified by PCR using primers H43.XAExtPCR2 and
Hucumnest shown in Table 34 for 10 cycles with the
program shown in Table 35.

The soluble PCR product was run on a gel and
showed a band of approximately 500 nt, as expected (see
FIG. 15B). The DNA was cleaved with enzymes SfiI and
NotI, gel purified, and ligated to similarly cleaved
vector pCES1.

Example 6: Description of Phage Display Vector CJRA05,
a member of the library built in vector DY3F7.

Table 36 contains an annotated DNA sequence
of a member of the library, CJRA05; see FIG. 16. Table
36 is to be read as follows: on each line everything
that follows an exclamation mark "!" is a comment. All
occurrences of A, C, G, and T before "!" are the DNA
sequence. Case is used only to show that certain bases
constitute special features, such as restriction sites,
ribosome binding sites, and the like, which are labeled
below the DNA. CJRA05 is a derivative of phage DY3F7,
obtained by cloning an ApaLI to NotI fragment into
these sites in DY3F31. DY3F31 is like DY3F7 except
that the light chain and heavy chain genes have been
replaced by "stuffer" DNA that does not code for any
antibody. DY3F7 contains an antibody that binds
streptavidin, but did not come from the present
library.

The phage genes start with gene ii and
continue with genes x, v, vii, ix, viii, iii, vi, i,
and iv. Gene iii has been slightly modified in that
eight codons have been inserted between the signal
sequence and the mature protein and the final amino
acids of the signal sequence have been altered. This
allows restriction enzyme recognition sites EagI and
XbaI to be present. Following gene iv is the phage
origin of replication (ori). After ori is bla, which
confers resistance to ampicillin (ApR). The phage
genes and bla are transcribed in the same sense.

After bla is the Fab cassette (illustrated
in FIG. 17) comprising:

a) PlacZ promoter,
b) A first Ribosome Binding Site (RBS1),
c) The signal sequence from M13 iii,
d) An ApaLI RERS,
e) A light chain (a kappa L20::JK1 shortened by one
codon at the V-J boundary in this case),
f) An AscI RERS,
g) A second Ribosome Binding Site (RBS2),
h) A signal sequence, preferably PelB, which
contains,
i) An SfiI RERS,
j) A synthetic 3-23 V region with diversity in CDR1
and CDR2,
k) A captured CDR3,
l) A partially synthetic J region (FR4 after BstEII),
m) CH1,
n) A NotI RERS,
o) A His6 tag,
p) A cMyc tag,
q) An amber codon,
r) An anchor DNA that encodes the same amino-acid
sequence as codons 273 to 424 of M13 iii (as shown in
Table 37),
s) Two stop codons,
t) An AvrII RERS, and
u) A trp terminator.
The anchor (item r) encodes the same
amino-acid sequence as do codons 273 to 424 of M13 iii
but the DNA is approximately as different as possible
from the wild-type DNA sequence. In Table 36, the
III'stump runs from base 8997 to base 9455. Below the
DNA, as comments, are the differences with wild-type
iii for the comparable codons with "!W.T" at the ends
of these lines. Note that Met and Trp have only a
single codon and must be left as is. These AA types
are rare. Ser codons can be changed at all three bases,
while Leu and Arg codons can be changed at two.
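
The statements about Met, Trp, Ser, Leu and Arg
follow directly from the standard genetic code. The
Python sketch below is offered as an illustration and
is not the procedure used to design the anchor of
Table 36. It computes, for each amino acid, the
largest number of base differences between any two
synonymous codons, and shows one naive way of
recoding a sequence so that each codon is replaced by
its most different synonym. All function names and
the short example sequence are hypothetical; a real
design would also weigh restriction sites and codon
usage.

```python
from itertools import combinations, product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def max_synonymous_distance(aa):
    """Largest base difference between any two codons for one amino acid."""
    codons = [c for c, a in CODON_TABLE.items() if a == aa]
    return max((hamming(c1, c2) for c1, c2 in combinations(codons, 2)),
               default=0)


def most_distant_synonym(codon):
    """Synonymous codon differing most from `codon` (ties: alphabetical)."""
    aa = CODON_TABLE[codon]
    synonyms = sorted(c for c, a in CODON_TABLE.items() if a == aa)
    return max(synonyms, key=lambda c: hamming(c, codon))


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna):
    """Replace every codon by its most distant synonym (naive scheme)."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(most_distant_synonym(dna[i:i + 3])
                   for i in range(0, len(dna), 3))


if __name__ == "__main__":
    for aa in "MWSLR":
        print(aa, max_synonymous_distance(aa))
    example = "ATGTCTCTGCGTTGG"  # hypothetical: Met Ser Leu Arg Trp
    print(recode(example), translate(recode(example)))
```

In this sketch Met and Trp codons are returned
unchanged, since no synonym exists, while the encoded
amino-acid sequence is preserved throughout.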

In most cases, one base change can be
introduced per codon. This has three advantages: 1)
recombination with the wild-type gene carried elsewhere
on the phage is less likely; 2) new restriction sites
can be introduced, facilitating construction; and 3)
sequencing primers that bind in only one of the two
regions can be designed.

The fragment of M13 III shown in CJRA05 is
the preferred length for the anchor segment.
Alternative longer or shorter anchor segments defined
by reference to whole mature III protein may also be
utilized.

The sequence of M13 III consists of the
following elements: Signal Sequence::Domain 1
(D1)::Linker 1 (L1)::Domain 2 (D2)::Linker 2
(L2)::Domain 3 (D3)::Transmembrane Segment (TM)::
Intracellular anchor (IC) (see Table 38).

The pIII anchor (also known as trpIII)
preferably consists of D2::L2::D3::TM::IC. Another
embodiment for the pIII anchor consists of
D2'::L2::D3::TM::IC (where D2' comprises the last 21
residues of D2 with the first 109 residues deleted). A
further embodiment of the pIII anchor consists of
D2'(C>S)::L2::D3::TM::IC (where D2'(C>S) is D2' with
the single C converted to S). A still further
embodiment consists of D3::TM::IC.

Table 38 shows a gene fragment comprising the
NotI site, His6 tag, cMyc tag, an amber codon, a
recombinant enterokinase cleavage site, and the whole
of mature M13 III protein. The DNA used to encode this
sequence is intentionally very different from the DNA
of wild-type gene iii as shown by the lines denoted
"W.T." containing the w.t. bases where these differ
from this gene. III is divided into domains denoted
"domain 1", "linker 1", "domain 2", "linker 2", "domain
3", "transmembrane segment", and "intracellular
anchor".

Alternative preferred anchor segments
(defined by reference to the sequence of Table 38)
include:

codons 1-29 joined to codons 104-435, deleting
domain 1 and retaining linker 1 to the end;

codons 1-38 joined to codons 104-435, deleting
domain 1 and retaining the rEK cleavage site plus
linker 1 to the end from III;

codons 1-29 joined to codons 236-435, deleting
domain 1, linker 1, and most of domain 2 and retaining
linker 2 to the end;

codons 1-38 joined to codons 236-435, deleting
domain 1, linker 1, and most of domain 2 and retaining
linker 2 to the end and the rEK cleavage site;

codons 1-29 joined to codons 236-435 and changing
codon 240 to Ser (e.g., agc), deleting domain 1, linker
1, and most of domain 2 and retaining linker 2 to the
end; and

codons 1-38 joined to codons 236-435 and changing
codon 240 to Ser (e.g., agc), deleting domain 1, linker
1, and most of domain 2 and retaining linker 2 to the
end and the rEK cleavage site.
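
The codon ranges listed above can be turned into
segment lengths by simple bookkeeping. The Python
sketch below is illustrative only: the function
join_codon_ranges and the placeholder codon list are
hypothetical, and the sequence of Table 38 itself is
not reproduced here. Codon numbers are 1-based and
inclusive, as in the text, and the fragment is assumed
to end at codon 435.

```python
def join_codon_ranges(codons, ranges):
    """Concatenate 1-based, inclusive codon ranges from a list of codons."""
    joined = []
    for first, last in ranges:
        if not 1 <= first <= last <= len(codons):
            raise ValueError(f"range {first}-{last} is out of bounds")
        joined.extend(codons[first - 1:last])
    return joined


# Placeholder for the 435 codons of the Table 38 fragment (labels only,
# not real sequence data).
placeholder = [f"codon{i}" for i in range(1, 436)]

# Codon ranges of the alternative anchor segments listed in the text.
ANCHOR_VARIANTS = {
    "1-29 + 104-435": [(1, 29), (104, 435)],
    "1-38 + 104-435": [(1, 38), (104, 435)],
    "1-29 + 236-435": [(1, 29), (236, 435)],
    "1-38 + 236-435": [(1, 38), (236, 435)],
}

lengths = {
    name: len(join_codon_ranges(placeholder, ranges))
    for name, ranges in ANCHOR_VARIANTS.items()
}

if __name__ == "__main__":
    for name, n_codons in lengths.items():
        print(name, n_codons)
```

The two variants in which codon 240 is changed to
Ser have the same lengths as the corresponding
236-435 variants, since only the identity of one codon
changes.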

The constructs would most readily be made by
methods similar to those of Wang and Wilkinson
(Biotechniques 2001: 31(4)722-724) in which PCR is used
to copy the vector except the part to be deleted and
matching restriction sites are introduced or retained
at either end of the part to be kept. Table 39 shows
the oligonucleotides to be used in deleting parts of
the III anchor segment. The DNA shown in Table 38 has
an NheI site before the DINDDRMA recombinant
enterokinase cleavage site (rEKCS). If NheI is used in
the deletion process with this DNA, the rEKCS site
would be lost. This site could be quite useful in
cleaving Fabs from the phage and might facilitate
capture of very high-affinity antibodies. One could
mutagenize this sequence so that the NheI site would
follow the rEKCS site; an Ala Ser amino-acid sequence
is already present. Alternatively, one could use SphI
for the deletions. This would involve a slight change
in amino acid sequence but would be of no consequence.

Example 7: Selection of antigen binders from an
enriched library of human antibodies using phage vector
DY3F31.

In this example the human antibody library
used is described in de Haard et al. (Journal of
Biological Chemistry, 274(26): 18218-30 (1999)). This
library, consisting of a large non-immune human Fab
phagemid library, was first enriched on antigen, either
on streptavidin or on phenyl-oxazolone (phOx). The
methods for this are well known in the art. Two
preselected Fab libraries, the first one selected once
on immobilized phOx-BSA (R1-ox) and the second one
selected twice on streptavidin (R2-strep), were chosen
for recloning.

These enriched repertoires of phage
antibodies, in which only a very low percentage have
binding activity to the antigen used in selection, were
confirmed by screening clones in an ELISA for antigen
binding. The selected Fab genes were transferred from
the phagemid vector of this library to the DY3F31
vector via ApaLI-NotI restriction sites.

DNA from the DY3F31 phage vector was
pretreated with ATP-dependent DNase to remove
chromosomal DNA and then digested with ApaLI and NotI.
An extra digestion with AscI was performed in between
to prevent self-ligation of the vector. The ApaLI/NotI
Fab fragment from the preselected libraries was
subsequently ligated to the vector DNA and transformed
into competent XL1-Blue MRF' cells.

Libraries were made using vector:insert
ratios of 1:2 for the phOx library and 1:3 for the STREP
library, and using 100 ng ligated DNA per 50 µl of
electroporation-competent cells (electroporation
conditions: one shock of 1700 V, 1 hour recovery of
cells in rich SOC medium, plating on ampicillin-
containing agar plates).

This transformation resulted in a library
size of 1.6 x 10^6 for R1-ox in DY3F31 and 2.1 x 10^6 for
R2-strep in DY3F31. Sixteen colonies from each library
were screened for insert, and all showed the correct
size insert (±1400 bp) (for both libraries).

Phage was prepared from these Fab libraries
as follows. A representative sample of the library was
inoculated in medium with ampicillin and glucose, and
at OD 0.5, the medium was exchanged for medium with
ampicillin and 1 mM IPTG. After overnight growth at
37°C, phage was harvested from the supernatant by
PEG-NaCl precipitation. Phage was used for selection on
antigen. R1-ox was selected on phOx-BSA coated by passive
adsorption onto immunotubes and R2-strep on
streptavidin-coated paramagnetic beads (Dynal, Norway),
in procedures described in de Haard et al. and Marks
et al., Journal of Molecular Biology, 222(3): 581-97
(1991). Phage titers and enrichments are given in
Table 40.
Clones from these selected libraries, dubbed
R2-ox and R3-strep respectively, were screened for
binding to their antigens in ELISA. Forty-four clones
from each selection were picked randomly and screened as
phage or soluble Fab for binding in ELISA. For the
libraries in DY3F31, clones were first grown in 2TY-2%
glucose-50 µg/ml AMP to an OD600 of approximately 0.5,
and then grown overnight in 2TY-50 µg/ml AMP +/- 1 mM
IPTG. Induction with IPTG may result in the production
of both phage-Fab and soluble Fab. Therefore the
(same) clones were also grown without IPTG. Table 41
shows the results of an ELISA screening of the
resulting supernatant, either for the detection of
phage particles with antigen binding (Anti-M13 HRP =
anti-phage antibody), or for the detection of human
Fabs, be it on phage or as soluble fragments, either
using the anti-myc antibody 9E10, which detects the
myc-tag that every Fab carries at the C-terminal end of
the heavy chain, followed by an HRP-labeled
rabbit-anti-mouse serum (column 9E10/RAM-HRP), or with
anti-light chain reagent followed by an HRP-labeled
goat-anti-rabbit antiserum (anti-CK/CL Gar-HRP).

The results show that in both cases
antigen binders are identified in the library, whether
detected as Fabs on phage or with the anti-Fab reagents
(Table 41). IPTG induction yields an increase in the
number of positives. It can also be seen that for the
phOx clones, the phage ELISA yields more positives than
the soluble Fab ELISA, most likely due to the avid
binding of phage. Twenty-four of the ELISA-positive
clones were screened using PCR of the Fab insert from
the vector, followed by digestion with BstNI. This
yielded 17 different patterns for the phOx-binding
Fabs in 23 samples that were correctly analyzed, and 6
out of 24 for the streptavidin-binding clones. Thus,
the data from the selection and screening from this
pre-enriched non-immune Fab library show that the
DY3F31 vector is suitable for display and selection of
Fab fragments, and provides both soluble Fab and Fab on
phage for screening experiments after selection.
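The BstNI fingerprinting step can be illustrated computationally. The sketch below is not the procedure used in the example: it digests sequences in silico and counts distinct fragment-length patterns, whereas the example compared gel patterns of PCR products. The clone sequences are short invented strings, and the function names are hypothetical.

```python
import re

BSTNI_SITE = re.compile(r"CC[AT]GG")  # BstNI recognizes CCWGG (W = A or T)
CUT_OFFSET = 2  # the enzyme cuts after the second C: CC^WGG


def fragment_pattern(seq):
    """Return the fragment lengths of a linear digest, longest first."""
    seq = seq.upper()
    cuts = [m.start() + CUT_OFFSET for m in BSTNI_SITE.finditer(seq)]
    edges = [0] + cuts + [len(seq)]
    return tuple(sorted((b - a for a, b in zip(edges, edges[1:])), reverse=True))


def count_patterns(clones):
    """Count distinct digestion patterns among a dict of clone sequences."""
    return len({fragment_pattern(seq) for seq in clones.values()})


# Invented toy inserts, for illustration only.
toy_clones = {
    "clone_a": "AAAACCAGGTTTT",
    "clone_b": "AAAACCAGGTTTT",
    "clone_c": "AACCTGGAACCAGGAA",
}
print(count_patterns(toy_clones), "distinct patterns among", len(toy_clones), "clones")
```

Identical patterns do not prove identical clones, and band sizes read from a gel are only approximate, so a pattern count of this kind is a lower bound on diversity.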

Example 8: Selection of Phage-antibody libraries on 
streptavidin magnetic beads. 

The following example describes a selection
in which one first depletes a sample of the library of
binders to streptavidin and optionally of binders to a
non-target (i.e., a molecule other than the target that
one does not want the selected Fab to bind). It is
assumed that one has a molecule, termed a
"competitive ligand", which binds the target and that
an antibody which binds at the same site would be
especially useful.

For this procedure Streptavidin Magnetic
Beads (Dynal) were blocked once with blocking solution
(2% Marvel Milk, PBS (pH 7.4), 0.01% Tween-20
("2%MPBST")) for 60 minutes at room temperature and
then washed five times with 2%MPBST. 450 µL of beads
were blocked for each depletion and subsequent
selection set.

Per selection, 6.25 µL of biotinylated
depletion target (1 mg/mL stock in PBST) was added to
0.250 mL of washed, blocked beads (from step 1). The
target was allowed to bind overnight, with tumbling, at
4°C. The next day, the beads were washed 5 times with
PBST.
Per selection, 0.010 mL of biotinylated
target antigen (1 mg/mL stock in PBST) was added to
0.100 mL of blocked and washed beads (from step 1).
The antigen was allowed to bind overnight, with
tumbling, at 4°C. The next day, the beads were washed
5 times with PBST.

In round 1, 2 x 10^12 up to 10^13 plaque forming
units (pfu) per selection were blocked against
non-specific binding by adding to 0.500 mL of 2%MPBS
(=2%MPBST without Tween) for 1 hr at RT (tumble). In
later rounds, 10^11 pfu per selection were blocked as
done in round 1.

Each phage pool was incubated with 50 µL of
depletion target beads (final wash supernatant removed
just before use) on a Labquake rotator for 10 min at
room temperature. After incubation, the phage
supernatant was removed and incubated with another 50
µL of depletion target beads. This was repeated 3 more
times using depletion target beads and twice using
blocked streptavidin beads for a total of 7 rounds of
depletion, so each phage pool required 350 µL of
depletion beads.
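The bead volumes quoted in this example can be cross-checked with a short calculation. The variable names are illustrative only; the quantities are those given in the text.

```python
BEADS_PER_DEPLETION_UL = 50
TARGET_BEAD_DEPLETIONS = 5        # first incubation, one repeat, then 3 more
STREPTAVIDIN_BEAD_DEPLETIONS = 2  # blocked streptavidin beads without target
SELECTION_BEADS_UL = 100          # target beads used in the selection itself

depletion_rounds = TARGET_BEAD_DEPLETIONS + STREPTAVIDIN_BEAD_DEPLETIONS
depletion_target_beads_ul = TARGET_BEAD_DEPLETIONS * BEADS_PER_DEPLETION_UL
depletion_beads_ul = depletion_rounds * BEADS_PER_DEPLETION_UL
total_beads_ul = depletion_beads_ul + SELECTION_BEADS_UL

print(depletion_rounds, "depletion rounds")
print(depletion_target_beads_ul, "uL depletion-target beads")
print(depletion_beads_ul, "uL depletion beads in total")
print(total_beads_ul, "uL beads per depletion and selection set")
```

The 250 µL of depletion-target beads matches the 0.250 mL of beads loaded with depletion target, and the overall total matches the volume of beads blocked per depletion and selection set.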

A small sample of each depleted library pool
was taken for titering. Each library pool was added to
0.100 mL of target beads (final wash supernatant was
removed just before use) and allowed to incubate for 2
hours at room temperature (tumble).

Beads were then washed as rapidly as possible
(e.g., 3 minutes total) with 5 X 0.500 mL PBST and then
2X with PBS. Phage still bound to beads after the
washing were eluted once with 0.250 mL of competitive
ligand (~1 µM) in PBST for 1 hour at room temperature
on a Labquake rotator. The eluate was removed, mixed
with 0.500 mL Minimal A salts solution and saved. For
a second selection, 0.500 mL 100 mM TEA was used for
elution for 10 min at RT, then neutralized in a mix of
0.250 mL of 1 M Tris, pH 7.4 + 0.500 mL Min A salts.

After the first selection elution, the beads
can be eluted again with 0.300 mL of non-biotinylated
target (1 mg/mL) for 1 hr at RT on a Labquake rotator.
Eluted phage are added to 0.450 mL Minimal A salts.

Three eluates (competitor from 1st selection,
target from 1st selection and neutralized TEA elution
from 2nd selection) were kept separate and a small
aliquot taken from each for titering. 0.500 mL Minimal
A salts were added to the remaining bead aliquots after
competitor and target elution and after TEA elution.
A small aliquot from each was taken for titering.

Each elution and each set of eluted beads was
mixed with 2X YT and an aliquot (e.g., 1 mL with
1 x 10^10/mL) of XL1-Blue MRF' E. coli cells (or other
F' cell line) which had been chilled on ice after having
been grown to mid-logarithmic phase, starved and
concentrated (see procedure below - "Mid-log prep of
XL1-Blue MRF' cells for infection").

After approximately 30 minutes at room
temperature, the phage/cell mixtures were spread onto
Bio-Assay Dishes (243 X 243 X 18 mm, Nalge Nunc)
containing 2XYT, 1 mM IPTG agar. The plates were
incubated overnight at 30°C. The next day, each
amplified phage culture was harvested from its
respective plate. The plate was flooded with 35 mL TBS
or LB, and cells were scraped from the plate. The
resuspended cells were transferred to a centrifuge
bottle. An additional 20 mL TBS or LB was used to
remove any cells from the plate and pooled with the
cells in the centrifuge bottle. The cells were
centrifuged out, and phage in the supernatant was
recovered by PEG precipitation. Over the next day, the
amplified phage preps were titered.

In the first round, two selections yielded
five amplified eluates. These amplified eluates were
panned for 2-3 additional rounds of selection
using ~1 x 10^12 input phage/round. For each additional
round, the depletion and target beads were prepared the
night before the round was initiated.

For the elution steps in subsequent rounds,
all elutions up to the elution step from which the
amplified elution came were done, and
the previous elutions were treated as washes. For the
bead-infection amplified phage, for example, the
competitive ligand and target elutions were done and
then tossed as washes (see below). Then the beads were
used to infect E. coli. Two pools, therefore, yielded
a total of 5 final elutions at the end of the
selection.

1st selection set

A. Ligand amplified elution: elute w/ ligand
for 1 hr, keep as elution

B. Target amplified elution: elute w/ ligand
for 1 hr, toss as wash; elute w/ target for 1
hr, keep as elution

C. Bead infect. amp. elution: elute w/
ligand for 1 hr, toss as wash; elute w/ target
for 1 hr, toss as wash; elute w/ cell
infection, keep as elution

2nd selection set

A. TEA amplified elution: elute w/ TEA
10 min, keep as elution

B. Bead infect. amp. elution: elute w/
TEA 10 min, toss as wash; elute w/ cell
infection, keep as elution
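The elution scheme above follows one rule: every elution step before the one that produced the amplified pool is performed and discarded as a wash, and only the originating step is kept. A minimal sketch of that rule follows; the dictionary layout and function names are assumptions made for illustration.

```python
# Elution steps of each selection set, in the order they are performed.
SELECTION_SETS = {
    "1st": ["ligand", "target", "cell infection"],
    "2nd": ["TEA", "cell infection"],
}


def elution_plan(steps, kept):
    """List the steps for a pool that was amplified from the `kept` elution."""
    index = steps.index(kept)
    washes = [(step, "toss as wash") for step in steps[:index]]
    return washes + [(kept, "keep as elution")]


def all_plans(selection_sets):
    """Build one plan per (selection set, originating elution) pair."""
    return {
        (name, kept): elution_plan(steps, kept)
        for name, steps in selection_sets.items()
        for kept in steps
    }


plans = all_plans(SELECTION_SETS)
for (name, kept), plan in plans.items():
    print(name, "selection set,", kept, "pool:", plan)
print(len(plans), "final elutions")
```

Three plans from the first set plus two from the second give the five final elutions mentioned in the text.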

Mid-log prep of XL1-Blue MRF' cells for infection

(based on Barbas et al. Phage Display manual procedure)

XL1-Blue MRF' cells were cultured in NZCYM (12.5 mg/mL
tet) at 37°C and 250 rpm overnight. A 500 mL
culture was started in a 2 liter flask by diluting cells
1/50 in NZCYM/tet (10 mL overnight culture added) and
incubated at 37°C at 250 rpm until an OD600 of 0.45
(1.5-2 hrs) was reached. Shaking was reduced to 100 rpm
for 10 min. When OD600 reached between 0.55-0.65, cells
were transferred to 2 x 250 mL centrifuge bottles and
centrifuged at 600 g for 15 min at 4°C. Supernatant
was poured off. Residual liquid was removed with a
pipette.

The pellets were gently resuspended (without
pipetting up and down) in the original volume of 1 X
Minimal A salts at room temp. The resuspended cells
were transferred back into the 2-liter flask and shaken
at 100 rpm for 45 min at 37°C. This process was
performed in order to starve the cells and restore pili.
The cells were transferred to 2 x 250 mL centrifuge
bottles, and centrifuged as earlier.

The cells were gently resuspended in ice cold
Minimal A salts (5 mL per 500 mL original culture).
The cells were put on ice for use in infections as soon
as possible.

The phage eluates were brought up to 7.5 mL
with 2XYT medium and 2.5 mL of cells were added. Beads
were brought up to 3 mL with 2XYT and 1 mL of cells
were added. The mixtures were incubated at 37°C for
30 min. The cells were plated on 2XYT, 1 mM IPTG agar
large NUNC plates and incubated for 18 hr at 30°C.

Example 9: Incorporation of synthetic region in FR1/3
region.

Described below are examples for
incorporating fixed residues in antibody sequences
for light chain kappa and lambda genes, and for heavy
chains. The experimental conditions and
oligonucleotides used for the examples below have been
described in previous examples (e.g., Examples 3 & 4).

The process for incorporating fixed FR1
residues in an antibody lambda sequence consists of 3
steps (see FIG. 18): (1) annealing of single-stranded
DNA material encoding VL genes to a partially
complementary oligonucleotide mix (indicated with Ext
and Bridge), to anneal in this example to the region
encoding residues 5-7 of the FR1 of the lambda genes
(indicated with X..X; within the lambda genes the
overlap may sometimes not be perfect); (2) ligation of
this complex; (3) PCR of the ligated material with the
indicated primer ('PCRpr') and for example one primer
based within the VL gene. In this process the first few
residues of all lambda genes will be encoded by the
sequences present in the oligonucleotides (Ext., Bridge
or PCRpr). After the PCR, the lambda genes can be
cloned using the indicated restriction site for ApaLI.

The process for incorporating fixed FR1
residues in an antibody kappa sequence (FIG. 19)
consists of 3 steps: (1) annealing of single-stranded
DNA material encoding VK genes to a partially
complementary oligonucleotide mix (indicated with Ext
and Bridge), to anneal in this example to the region
encoding residues 8-10 of the FR1 of the kappa genes
(indicated with X..X; within the kappa genes the
overlap may sometimes not be perfect); (2) ligation of
this complex; (3) PCR of the ligated material with the
indicated primer ('PCRpr') and for example one primer
based within the VK gene. In this process the first few
(8) residues of all kappa genes will be encoded by the
sequences present in the oligonucleotides (Ext., Bridge
or PCRpr). After the PCR, the kappa genes can be
cloned using the indicated restriction site for ApaLI.

The process of incorporating fixed FR3
residues in an antibody heavy chain sequence (FIG. 20)
consists of 3 steps: (1) annealing of single-stranded
DNA material encoding part of the VH genes (for example
encoding FR3, CDR3 and FR4 regions) to a partially
complementary oligonucleotide mix (indicated with Ext
and Bridge), to anneal in this example to the region
encoding residues 92-94 (within the FR3 region) of VH
genes (indicated with X..X; within the VH genes the
overlap may sometimes not be perfect); (2) ligation of
this complex; (3) PCR of the ligated material with the
indicated primer ('PCRpr') and for example one primer
based within the VH gene (such as in the FR4 region).
In this process certain residues of all VH genes will
be encoded by the sequences present in the
oligonucleotides used here, in particular from PCRpr
(for residues 70-73), or from Ext/Bridge
oligonucleotides (residues 74-91). After the PCR, the
partial VH genes can be cloned using the indicated
restriction site for XbaI.
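The annealing and ligation steps described above can be pictured as a bridge-templated join: the bridge oligonucleotide pairs with the end of the extender and the start of the gene fragment, holding the two ends together for ligation. The sketch below models only that base-pairing check. The sequences are invented, the strand orientation is a simplification, the function names are hypothetical, and the tolerance for an imperfect overlap is an assumed parameter rather than a value taken from this document.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def bridge_ligate(ext, target, bridge, n_ext, n_target, max_mismatch=1):
    """Join ext to target if the bridge can pair across the junction.

    The bridge must pair with the last n_ext bases of ext followed by the
    first n_target bases of target; up to max_mismatch mispairs are allowed.
    """
    junction = (ext[-n_ext:] + target[:n_target]).upper()
    if len(bridge) != len(junction):
        raise ValueError("bridge length must equal n_ext + n_target")
    if mismatches(revcomp(bridge), junction) > max_mismatch:
        return None
    return ext + target


# Invented example sequences, for illustration only.
ext = "GGATCCAAGT"
gene = "CTGACTCAGCC"
bridge = revcomp(ext[-4:] + gene[:6])
variant_gene = "CTGTCTCAGCC"  # one base differs within the bridged region

print(bridge_ligate(ext, gene, bridge, 4, 6))
print(bridge_ligate(ext, variant_gene, bridge, 4, 6))
print(bridge_ligate(ext, variant_gene, bridge, 4, 6, max_mismatch=0))
```

Allowing a small number of mispairs reflects the remark that the overlap within a gene family may sometimes not be perfect; the subsequent PCR step is not modelled here.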

It will be understood that the foregoing is
only illustrative of the principles of this invention
and that various modifications can be made by those
skilled in the art without departing from the scope
and spirit of the invention.
Table 1: Human GLG FR3 sequences
! VH1
 66  67  68  69  70  71  72  73  74  75  76  77  78  79  80
agg gtc acc atg acc agg gac acg tcc atc agc aca gcc tac atg
 81  82 82a 82b 82c  83  84  85  86  87  88  89  90  91  92
gag ctg agc agg ctg aga tct gac gac acg gcc gtg tat tac tgt
 93  94  95
gcg aga ga ! 1-02# 1

aga gtc acc att acc agg gac aca tcc gcg agc aca gcc tac atg
gag ctg agc agc ctg aga tct gaa gac acg gct gtg tat tac tgt
gcg aga ga ! 1-03# 2

aga gtc acc atg acc agg aac acc tcc ata agc aca gcc tac atg
gag ctg agc agc ctg aga tct gag gac acg gcc gtg tat tac tgt
gcg aga gg ! 1-08# 3

aga gtc acc atg acc aca gac aca tcc acg agc aca gcc tac atg
gag ctg agg agc ctg aga tct gac gac acg gcc gtg tat tac tgt
gcg aga ga ! 1-18# 4

aga gtc acc atg acc gag gac aca tct aca gac aca gcc tac atg
gag ctg agc agc ctg aga tct gag gac acg gcc gtg tat tac tgt
gca aca ga ! 1-24# 5

aga gtc acc att acc agg gac agg tct atg agc aca gcc tac atg
gag ctg agc agc ctg aga tct gag gac aca gcc atg tat tac tgt
gca aga ta ! 1-45# 6

aga gtc acc atg acc agg gac acg tcc acg agc aca gtc tac atg
gag ctg agc agc ctg aga tct gag gac acg gcc gtg tat tac tgt
gcg aga ga ! 1-46# 7

aga gtc acc att acc agg gac atg tcc aca agc aca gcc tac atg
gag ctg agc agc ctg aga tcc gag gac acg gcc gtg tat tac tgt
gcg gca ga ! 1-58# 8

aga gtc acg att acc gcg gac gaa tcc acg agc aca gcc tac atg
gag ctg agc agc ctg aga tct gag gac acg gcc gtg tat tac tgt
gcg aga ga ! 1-69# 9

aga gtc acg att acc gcg gac aaa tcc acg agc aca gcc tac atg
gag ctg agc agc ctg aga tct gag gac acg gcc gtg tat tac tgt
gcg aga ga ! 1-e# 10

aga gtc acc ata acc gcg gac acg tct aca gac aca gcc tac atg
gag ctg agc agc ctg aga tct gag gac acg gcc gtg tat tac tgt
gca aca ga ! 1-f# 11
! VH2
agg ctc acc atc acc aag gac acc tcc aaa aac cag gtg gtc ctt
aca atg acc aac atg gac cct gtg gac aca gcc aca tat tac tgt
gca cac aga c ! 2-05# 12

agg ctc acc atc tcc aag gac acc tcc aaa agc cag gtg gtc ctt
acc atg acc aac atg gac cct gtg gac aca gcc aca tat tac tgt
gca cgg ata c ! 2-26# 13

agg ctc acc atc tcc aag gac acc tcc aaa aac cag gtg gtc ctt
aca atg acc aac atg gac cct gtg gac aca gcc acg tat tac tgt
gca cgg ata c ! 2-70# 14

! VH3
cga ttc acc atc tcc aga gac aac gcc aag aac tca ctg tat ctg
caa atg aac agc ctg aga gcc gag gac acg gct gtg tat tac tgt
gcg aga ga ! 3-07# 15

cga ttc acc atc tcc aga gac aac gcc aag aac tcc ctg tat ctg
caa atg aac agt ctg aga gct gag gac acg gcc ttg tat tac tgt
gca aaa gat a ! 3-09# 16

cga ttc acc atc tcc agg gac aac gcc aag aac tca ctg tat ctg
caa atg aac agc ctg aga gcc gag gac acg gcc gtg tat tac tgt
gcg aga ga ! 3-11# 17

cga ttc acc atc tcc aga gaa aat gcc aag aac tcc ttg tat ctt
caa atg aac agc ctg aga gcc ggg gac acg gct gtg tat tac tgt
gca aga ga ! 3-13# 18

aga ttc acc atc tca aga gat gat tca aaa aac acg ctg tat ctg
caa atg aac agc ctg aaa acc gag gac aca gcc gtg tat tac tgt
acc aca ga ! 3-15# 19

cga ttc acc atc tcc aga gac aac gcc aag aac tcc ctg tat ctg
caa atg aac agt ctg aga gcc gag gac acg gcc ttg tat cac tgt
gcg aga ga ! 3-20# 20

cga ttc acc atc tcc aga gac aac gcc aag aac tca ctg tat ctg
caa atg aac agc ctg aga gcc gag gac acg gct gtg tat tac tgt
gcg aga ga ! 3-21# 21

cgg ttc acc atc tcc aga gac aat tcc aag aac acg ctg tat ctg
caa atg aac agc ctg aga gcc gag gac acg gcc gta tat tac tgt
gcg aaa ga ! 3-23# 22

cga ttc acc atc tcc aga gac aat tcc aag aac acg ctg tat ctg
caa atg aac agc ctg aga gct gag gac acg gct gtg tat tac tgt
gcg aaa ga ! 3-30# 23

cga ttc acc atc tcc aga gac aat tcc aag aac acg ctg tat ctg
caa atg aac agc ctg aga gct gag gac acg gct gtg tat tac tgt
gcg aga ga ! 3303# 24

cga ttc acc atc tcc aga gac aat tcc aag aac acg ctg tat ctg
caa atg aac agc ctg aga gct gag gac acg gct gtg tat tac tgt
gcg aaa ga ! 3305# 25

cga ttc acc atc tcc aga gac aat tcc aag aac acg ctg tat ctg
caa atg aac agc ctg aga gcc gag gac acg gct gtg tat tac tgt
gcg aga ga ! 3-33# 26

cga ttc acc atc tcc aga gac aac agc aaa aac tcc ctg tat ctg
caa atg aac agt ctg aga act gag gac acc gcc ttg tat tac tgt
gca aaa gat a ! 3-43# 27

cga ttc acc atc tcc aga gac aat gcc aag aac tca ctg tat ctg
caa atg aac agc ctg aga gac gag gac acg gct gtg tat tac tgt
gcg aga ga ! 3-48# 28

aga ttc acc atc tca aga gat ggt tcc aaa agc atc gcc tat ctg
caa atg aac agc ctg aaa acc gag gac aca gcc gtg tat tac tgt
act aga ga ! 3-49# 29

cga ttc acc atc tcc aga gac aat tcc aag aac acg ctg tat ctt
caa atg aac agc ctg aga gcc gag gac acg gcc gtg tat tac tgt
gcg aga ga ! 3-53# 30

aga ttc acc atc tcc aga gac aat tcc aag aac acg ctg tat ctt
caa atg ggc agc ctg aga gct gag gac atg gct gtg tat tac tgt
gcg aga ga ! 3-64# 31

aga ttc acc atc tcc aga gac aat tcc aag aac acg ctg tat ctt
caa atg aac agc ctg aga gct gag gac acg gct gtg tat tac tgt
gcg aga ga ! 3-66# 32

aga ttc acc atc tca aga gat gat tca aag aac tca ctg tat ctg
caa atg aac agc ctg aaa acc gag gac acg gcc gtg tat tac tgt
gct aga ga ! 3-72# 33

agg ttc acc atc tcc aga gat gat tca aag aac acg gcg tat ctg
caa atg aac agc ctg aaa acc gag gac acg gcc gtg tat tac tgt
act aga ca ! 3-73# 34

cga ttc acc atc tcc aga gac aac gcc aag aac acg ctg tat ctg
caa atg aac agt ctg aga gcc gag gac acg gct gtg tat tac tgt
gca aga ga ! 3-74# 35

aga ttc acc atc tcc aga gac aat tcc aag aac acg ctg cat ctt
caa atg aac agc ctg aga gct gag gac acg gct gtg tat tac tgt
aag aaa ga ! 3-d# 36

! VH4
cga gtc acc ata tca gta gac aag tcc aag aac cag ttc tcc ctg
aag ctg agc tct gtg acc gcc gcg gac acg gcc gtg tat tac tgt
gcg aga ga ! 4-04# 37

cga gtc acc atg tca gta gac acg tcc aag aac cag ttc tcc ctg
aag ctg agc tct gtg acc gcc gtg gac acg gcc gtg tat tac tgt
gcg aga aa ! 4-28# 38

cga gtt acc ata tca gta gac acg tct aag aac cag ttc tcc ctg
aag ctg agc tct gtg act gcc gcg gac acg gcc gtg tat tac tgt
gcg aga ga ! 4301# 39

cga gtc acc ata tca gta gac agg tcc aag aac cag ttc tcc ctg
aag ctg agc tct gtg acc gcc gcg gac acg gcc gtg tat tac tgt
gcc aga ga ! 4302# 40

cga gtt acc ata tca gta gac acg tcc aag aac cag ttc tcc ctg
aag ctg agc tct gtg act gcc gca gac acg gcc gtg tat tac tgt
gcc aga ga ! 4304# 41

cga gtt acc ata tca gta gac acg tct aag aac cag ttc tcc ctg
aag ctg agc tct gtg act gcc gcg gac acg gcc gtg tat tac tgt
gcg aga ga ! 4-31# 42

cga gtc acc ata tca gta gac acg tcc aag aac cag ttc tcc ctg
aag ctg agc tct gtg acc gcc gcg gac acg gct gtg tat tac tgt
gcg aga ga ! 4-34# 43

cga gtc acc ata tcc gta gac acg tcc aag aac cag ttc tcc ctg
aag ctg agc tct gtg acc gcc gca gac acg gct gtg tat tac tgt
gcg aga ca ! 4-39# 44

cga gtc acc ata tca gta gac acg tcc aag aac cag ttc tcc ctg
aag ctg agc tct gtg acc gct gcg gac acg gcc gtg tat tac tgt
gcg aga ga ! 4-59# 45

cga gtc acc ata tca gta gac acg tcc aag aac cag ttc tcc ctg
aag ctg agc tct gtg acc gct gcg gac acg gcc gtg tat tac tgt
gcg aga ga ! 4-61# 46

cga gtc acc ata tca gta gac acg tcc aag aac cag ttc tcc ctg
aag ctg agc tct gtg acc gcc gca gac acg gcc gtg tat tac tgt
gcg aga ga ! 4-b# 47

! VH5
cag gtc acc atc tca gcc gac aag tcc atc agc acc gcc tac ctg
cag tgg agc agc ctg aag gcc tcg gac acc gcc atg tat tac tgt
gcg aga ca ! 5-51# 48

cac gtc acc atc tca gct gac aag tcc atc agc act gcc tac ctg
cag tgg agc agc ctg aag gcc tcg gac acc gcc atg tat tac tgt
gcg aga ! 5-a# 49

! VH6
cga ata acc atc aac cca gac aca tcc aag aac cag ttc tcc ctg
cag ctg aac tct gtg act ccc gag gac acg gct gtg tat tac tgt
gca aga ga ! 6-1# 50

! VH7
cgg ttt gtc ttc tcc ttg gac acc tct gtc agc acg gca tat ctg
cay atc tyc agc cta aag gct gag gac act gcc gtg tat tac tgt
gcg aga ga ! 74.1# 51
Table 2: Enzymes that either cut 15 or more human GLGs or have 5+-base recognition in FR3
Typical entry:

REname Recognition #sites
GLGid#: base# GLGid#: base# GLGid#: base#

BstEII Ggtnacc 2
1: 3 48: 3
There are 2 hits at base# 3
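Entries of this kind can be reproduced by scanning each FR3 sequence for a recognition pattern written in IUPAC codes. The sketch below is an illustration, not the program used to build Table 2. It searches only the strand as written, and it uses just the first codons of GLG 1 (1-02) and GLG 48 (5-51) from Table 1; the names `find_sites` and `glg_prefixes` are hypothetical.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_sites(sequence, recognition):
    """Return the 1-based start of every match, overlapping matches included."""
    seq = re.sub(r"\s+", "", sequence).upper()
    pattern = "".join(IUPAC[base] for base in recognition.upper())
    return [m.start() + 1 for m in re.finditer(f"(?={pattern})", seq)]


# First codons of two Table 1 entries (base# 1 is the first base of codon 66).
glg_prefixes = {
    1: "agg gtc acc atg acc agg gac acg",
    48: "cag gtc acc",
}

for enzyme, site in [("BstEII", "Ggtnacc"), ("MaeIII", "gtnac"), ("HphI", "tcacc")]:
    for glg_id, prefix in glg_prefixes.items():
        for base in find_sites(prefix, site):
            print(f"{enzyme} {glg_id}: {base}")
```

For a non-palindromic site such as that of HphI, a complete search would also scan the reverse complement; the table lists the site in the orientation shown, so the sketch does the same.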



MaeIII gtnac 36
1: 4 2: 4 3: 4 4: 4 5: 4 6: 4
7: 4 8: 4 9: 4 10: 4 11: 4 37: 4
37: 58 38: 4 38: 58 39: 4 39: 58 40: 4
40: 58 41: 4 41: 58 42: 4 42: 58 43: 4
43: 58 44: 4 44: 58 45: 4 45: 58 46: 4
46: 58 47: 4 47: 58 48: 4 49: 4 50: 58
There are 24 hits at base# 4

Tsp45I gtsac 33
1: 4 2: 4 3: 4 4: 4 5: 4 6: 4
7: 4 8: 4 9: 4 10: 4 11: 4 37: 4
37: 58 38: 4 38: 58 39: 58 40: 4 40: 58
41: 58 42: 58 43: 4 43: 58 44: 4 44: 58
45: 4 45: 58 46: 4 46: 58 47: 4 47: 58
48: 4 49: 4 50: 58
There are 21 hits at base# 4

HphI tcacc 45
1: 5 2: 5 3: 5 4: 5 5: 5 6: 5
7: 5 8: 5 11: 5 12: 5 12: 11 13: 5
14: 5 15: 5 16: 5 17: 5 18: 5 19: 5
20: 5 21: 5 22: 5 23: 5 24: 5 25: 5
26: 5 27: 5 28: 5 29: 5 30: 5 31: 5
32: 5 33: 5 34: 5 35: 5 36: 5 37: 5
38: 5 40: 5 43: 5 44: 5 45: 5 46: 5
47: 5 48: 5 49: 5
There are 44 hits at base# 5
NlaIII CATG 26
1: 3 1: 42 2: 42 3: 9 3: 42 4: 9
4: 42 5: 9 5: 42 6: 42 6: 78 7: 9
7: 42 8: 21 8: 42 9: 42 10: 42 11: 42
12: 57 13: 48 13: 57 14: 57 31: 72 38: 9
48: 78 49: 78
There are 11 hits at base# 42
There are 1 hits at base# 48 Could cause raggedness.

BsaJI Ccnngg 37
1: 14 2: 14 5: 14 6: 14 7: 14 8: 14
8: 65 9: 14 10: 14 11: 14 12: 14 13: 14
14: 14 15: 65 17: 14 17: 65 18: 65 19: 65
20: 65 21: 65 22: 65 26: 65 29: 65 30: 65
33: 65 34: 65 35: 65 37: 65 38: 65 39: 65
40: 65 42: 65 43: 65 48: 65 49: 65 50: 65
51: 14
There are 23 hits at base# 65
There are 14 hits at base# 14

AluI AGct 42
1: 47 2: 47 3: 47 4: 47 5: 47 6: 47
7: 47 8: 47 9: 47 10: 47 11: 47 16: 63
23: 63 24: 63 25: 63 31: 63 32: 63 36: 63
37: 47 37: 52 38: 47 38: 52 39: 47 39: 52
40: 47 40: 52 41: 47 41: 52 42: 47 42: 52
43: 47 43: 52 44: 47 44: 52 45: 47 45: 52
46: 47 46: 52 47: 47 47: 52 49: 15 50: 47
There are 23 hits at base# 47
There are 11 hits at base# 52 Only 5 bases from 47

BlpI GCtnagc 21
1: 48 2: 48 3: 48 5: 48 6: 48 7: 48
8: 48 9: 48 10: 48 11: 48 37: 48 38: 48
39: 48 40: 48 41: 48 42: 48 43: 48 44: 48
45: 48 46: 48 47: 48
There are 21 hits at base# 48
MwoI GCNNNNNnngc 19
1: 48 2: 28 19: 36 22: 36 23: 36 24: 36
25: 36 26: 36 35: 36 37: 67 39: 67 40: 67
41: 67 42: 67 43: 67 44: 67 45: 67 46: 67
47: 67
There are 10 hits at base# 67
There are 7 hits at base# 36

DdeI Ctnag 71
1: 49 1: 58 2: 49 2: 58 3: 49 3: 58
3: 65 4: 49 4: 58 5: 49 5: 58 5: 65
6: 49 6: 58 6: 65 7: 49 7: 58 7: 65
8: 49 8: 58 9: 49 9: 58 9: 65 10: 49
10: 58 10: 65 11: 49 11: 58 11: 65 15: 58
16: 58 16: 65 17: 58 18: 58 20: 58 21: 58
22: 58 23: 58 23: 65 24: 58 24: 65 25: 58
25: 65 26: 58 27: 58 27: 65 28: 58 30: 58
31: 58 31: 65 32: 58 32: 65 35: 58 36: 58
36: 65 37: 49 38: 49 39: 26 39: 49 40: 49
41: 49 42: 26 42: 49 43: 49 44: 49 45: 49
46: 49 47: 49 48: 12 49: 12 51: 65
There are 29 hits at base# 58
There are 22 hits at base# 49 Only nine bases from 58
There are 16 hits at base# 65 Only seven bases from 58

25 

BglII Agatct 11

1: 61 2: 61 3: 61 4: 61 5: 61 6: 61
7: 61 9: 61 10: 61 11: 61 51: 47
There are 10 hits at base# 61

BstYI Rgatcy 12

1: 61 2: 61 3: 61 4: 61 5: 61 6: 61
7: 61 8: 61 9: 61 10: 61 11: 61 51: 47
There are 11 hits at base# 61



WO 02/083872

PCT/US02/12405

Hpy188I TCNga 17

1: 64 2: 64 3: 64 4: 64 5: 64 6: 64 
7: 64 8: 64 9: 64 10: 64 11: 64 16: 57 

20: 57 27: 57 35: 57 48: 67 49: 67 

There are 11 hits at base# 64 

There are 4 hits at base# 57 

There are 2 hits at base# 67 Could be ragged. 
MslI CAYNNnnRTG 44

1: 72 2: 72 3: 72 4: 72 5: 72 6: 72
7: 72 8: 72 9: 72 10: 72 11: 72 15: 72
17: 72 18: 72 19: 72 21: 72 23: 72 24: 72
25: 72 26: 72 28: 72 29: 72 30: 72 31: 72
32: 72 33: 72 34: 72 35: 72 36: 72 37: 72
38: 72 39: 72 40: 72 41: 72 42: 72 43: 72
44: 72 45: 72 46: 72 47: 72 48: 72 49: 72
50: 72 51: 72
There are 44 hits at base# 72












BsiEI CGRYcg 23

1: 74 3: 74 4: 74 5: 74 7: 74 8: 74
9: 74 10: 74 11: 74 17: 74 22: 74 30: 74
33: 74 34: 74 37: 74 38: 74 39: 74 40: 74
41: 74 42: 74 45: 74 46: 74 47: 74
There are 23 hits at base# 74












EaeI Yggccr 23

1: 74 3: 74 4: 74 5: 74 7: 74 8: 74
9: 74 10: 74 11: 74 17: 74 22: 74 30: 74
33: 74 34: 74 37: 74 38: 74 39: 74 40: 74
41: 74 42: 74 45: 74 46: 74 47: 74
There are 23 hits at base# 74












EagI Cggccg 23

1: 74 3: 74 4: 74 5: 74 7: 74 8: 74
9: 74 10: 74 11: 74 17: 74 22: 74 30: 74
33: 74 34: 74 37: 74 38: 74 39: 74 40: 74
41: 74 42: 74 45: 74 46: 74 47: 74
There are 23 hits at base# 74
HaeIII GGcc 27

1: 75 3: 75 4: 75 5: 75 7: 75 8: 75
9: 75 10: 75 11: 75 16: 75 17: 75 20: 75
22: 75 30: 75 33: 75 34: 75 37: 75 38: 75
39: 75 40: 75 41: 75 42: 75 45: 75 46: 75
47: 75 48: 63 49: 63
There are 25 hits at base# 75



Bst4CI ACNgt 65 °C 63 Sites There is a third isoschizomer

1: 86 2: 86 3: 86 4: 86 5: 86 6: 86
7: 34 7: 86 8: 86 9: 86 10: 86 11: 86
12: 86 13: 86 14: 86 15: 36 15: 86 16: 53
16: 86 17: 36 17: 86 18: 86 19: 86 20: 53
20: 86 21: 36 21: 86 22: 0 22: 86 23: 86
24: 86 25: 86 26: 86 27: 53 27: 86 28: 36
28: 86 29: 86 30: 86 31: 86 32: 86 33: 36
33: 86 34: 86 35: 53 35: 86 36: 86 37: 86
38: 86 39: 86 40: 86 41: 86 42: 86 43: 86
44: 86 45: 86 46: 86 47: 86 48: 86 49: 86
50: 86 51: 0 51: 86
There are 51 hits at base# 86 All the other sites are well away



HpyCH4III ACNgt 63

1: 86 2: 86 3: 86 4: 86 5: 86 6: 86
7: 34 7: 86 8: 86 9: 86 10: 86 11: 86
12: 86 13: 86 14: 86 15: 36 15: 86 16: 53
16: 86 17: 36 17: 86 18: 86 19: 86 20: 53
20: 86 21: 36 21: 86 22: 0 22: 86 23: 86
24: 86 25: 86 26: 86 27: 53 27: 86 28: 36
28: 86 29: 86 30: 86 31: 86 32: 86 33: 36
33: 86 34: 86 35: 53 35: 86 36: 86 37: 86
38: 86 39: 86 40: 86 41: 86 42: 86 43: 86
44: 86 45: 86 46: 86 47: 86 48: 86 49: 86
50: 86 51: 0 51: 86
There are 51 hits at base# 86
HinfI Gantc 43

2: 2 3: 2 4: 2 5: 2 6: 2 7: 2
8: 2 9: 2 9: 22 10: 2 11: 2 15: 2
16: 2 17: 2 18: 2 19: 2 19: 22 20: 2
21: 2 23: 2 24: 2 25: 2 26: 2 27: 2
28: 2 29: 2 30: 2 31: 2 32: 2 33: 2
33: 22 34: 22 35: 2 36: 2 37: 2 38: 2
40: 2 43: 2 44: 2 45: 2 46: 2 47: 2
50: 60
There are 38 hits at base# 2



MlyI GAGTCNNNNNn 18

2: 2 3: 2 4: 2 5: 2 6: 2 7: 2
8: 2 9: 2 10: 2 11: 2 37: 2 38: 2
40: 2 43: 2 44: 2 45: 2 46: 2 47: 2
There are 18 hits at base# 2



PleI gagtc 18

2: 2 3: 2 4: 2 5: 2 6: 2 7: 2
8: 2 9: 2 10: 2 11: 2 37: 2 38: 2
40: 2 43: 2 44: 2 45: 2 46: 2 47: 2
There are 18 hits at base# 2

AciI Ccgc 24

2: 26 9: 14 10: 14 11: 14 27: 74 37: 62
37: 65 38: 62 39: 65 40: 62 40: 65 41: 65
42: 65 43: 62 43: 65 44: 62 44: 65 45: 62
46: 62 47: 62 47: 65 48: 35 48: 74 49: 74
There are 8 hits at base# 62
There are 8 hits at base# 65
There are 3 hits at base# 14
There are 3 hits at base# 74
There are 1 hits at base# 26
There are 1 hits at base# 35

-"- Gcgg 11

8: 91 9: 16 10: 16 11: 16 37: 67 39: 67
40: 67 42: 67 43: 67 45: 67 46: 67
There are 7 hits at base# 67
There are 3 hits at base# 16
There are 1 hits at base# 91
BsiHKAI GWGCWc 20

2: 30 4: 30 6: 30 7: 30 9: 30 10: 30
12: 89 13: 89 14: 89 37: 51 38: 51 39: 51
40: 51 41: 51 42: 51 43: 51 44: 51 45: 51
46: 51 47: 51
There are 11 hits at base# 51



Bsp1286I GDGCHc 20

2: 30 4: 30 6: 30 7: 30 9: 30 10: 30
12: 89 13: 89 14: 89 37: 51 38: 51 39: 51
40: 51 41: 51 42: 51 43: 51 44: 51 45: 51
46: 51 47: 51
There are 11 hits at base# 51

15 

HgiAI GWGCWc 20

2: 30 4: 30 6: 30 7: 30 9: 30 10: 30
12: 89 13: 89 14: 89 37: 51 38: 51 39: 51
40: 51 41: 51 42: 51 43: 51 44: 51 45: 51
46: 51 47: 51
There are 11 hits at base# 51



BsoFI GCngc 26

2: 53 3: 53 5: 53 6: 53 7: 53 8: 53
8: 91 9: 53 10: 53 11: 53 31: 53 36: 36
37: 64 39: 64 40: 64 41: 64 42: 64 43: 64
44: 64 45: 64 46: 64 47: 64 48: 53 49: 53
50: 45 51: 53
There are 13 hits at base# 53
There are 10 hits at base# 64


TseI Gcwgc 17

2: 53 3: 53 5: 53 6: 53 7: 53 8: 53
9: 53 10: 53 11: 53 31: 53 36: 36 45: 64
46: 64 48: 53 49: 53 50: 45 51: 53
There are 13 hits at base# 53
MnlI gagg 34







j : 


»3 


h : 


31 






Z3 • 


O 1 


o « 


o / 




7: 67 


8: 


67 


9: 


67 


10.: 


67 


11: 


67 




67 




16* 67 


17: 


67 


19: 


67 


20: 


67 


91 • 


fi7 


oo . 
■ 


67 


•J 


3^- 67 


24: 


67 


25: 


67 


26: 


67 




fi7 


9R « 


67 




29: 67 


30: 


67 


31: 


67 


32: 


67 


33: 


67 


34: 


67 




35: 67 


36: 


67 


50: 


67 


51: 


67 












There are 31 hits at base# 67












10 


HpyCH4V TGca 34

5: 90 6: 90 11: 90 12: 90 13: 90 14: 90
15: 44 16: 44 16: 90 17: 44 18: 90 19: 44
20: 44 21: 44 22: 44 23: 44 24: 44 25: 44
26: 44 27: 44 27: 90 28: 44 29: 44 33: 44
34: 44 35: 44 35: 90 36: 38 48: 44 49: 44
50: 44 50: 90 51: 44 51: 52
There are 21 hits at base# 44
There are 1 hits at base# 52



AccI GTmkac 13 5-base recognition

7: 37 11: 24 37: 16 38: 16 39: 16 40: 16
41: 16 42: 16 43: 16 44: 16 45: 16 46: 16
47: 16
There are 11 hits at base# 16

SacII CCGCgg 8 6-base recognition

9: 14 10: 14 11: 14 37: 65 39: 65 40: 65
42: 65 43: 65
There are 5 hits at base# 65
There are 3 hits at base# 14



TfiI Gawtc 24

9: 22 15: 2 16: 2 17: 2 18: 2 19: 2
19: 22 20: 2 21: 2 23: 2 24: 2 25: 2
26: 2 27: 2 28: 2 29: 2 30: 2 31: 2
32: 2 33: 2 33: 22 34: 22 35: 2 36: 2
There are 20 hits at base# 2
BsmAI Nnnnnngagac 19

15: 11 16: 11 20: 11 21: 11 22: 11 23: 11
24: 11 25: 11 26: 11 27: 11 28: 11 28: 56
30: 11 31: 11 32: 11 35: 11 36: 11 44: 87
48: 87
There are 16 hits at base# 11














BpmI ctccag 19

15: 12 16: 12 17: 12 18: 12 20: 12 21: 12
22: 12 23: 12 24: 12 25: 12 26: 12 27: 12
28: 12 30: 12 31: 12 32: 12 34: 12 35: 12
36: 12
There are 19 hits at base# 12












15 


XmnI GAANNnnttc 12

37: 30 38: 30 39: 30 40: 30 41: 30 42: 30
43: 30 44: 30 45: 30 46: 30 47: 30 50: 30
There are 12 hits at base# 30












20 


BsrI NCcagt 12

37: 32 38: 32 39: 32 40: 32 41: 32 42: 32
43: 32 44: 32 45: 32 46: 32 47: 32 50: 32
There are 12 hits at base# 32












25 


BanII GRGCYc 11

37: 51 38: 51 39: 51 40: 51 41: 51 42: 51
43: 51 44: 51 45: 51 46: 51 47: 51
There are 11 hits at base# 51












30 


Ecl136I GAGctc 11

37: 51 38: 51 39: 51 40: 51 41: 51 42: 51
43: 51 44: 51 45: 51 46: 51 47: 51
There are 11 hits at base# 51












35 


SacI GAGCTc 11

37: 51 38: 51 39: 51 40: 51 41: 51 42: 51
43: 51 44: 51 45: 51 46: 51 47: 51
There are 11 hits at base# 51



Table 3: Synthetic 3-23 FR3 of human heavy chains showing positions of possible cleavage sites

! Sites engineered into the synthetic gene are shown in upper case DNA
! with the RE name between vertical bars (as in | XbaI |).
! RERSs frequently found in GLGs are shown below the synthetic sequence
! with the name to the right (as in gtn ac=MaeIII(24), indicating that
! 24 of the 51 GLGs contain the site).

! FR3
! 89 90 (codon # in synthetic 3-23)
! R F
|cgc|ttc| 6
! Allowed DNA |cgn|tty|
!             |agr|
ga ntc = HinfI(38)
ga gtc = PleI(18)
ga wtc = TfiI(20)
gtn ac = MaeIII(24)
gts ac = Tsp45I(21)
tc acc = HphI(44)

! FR3
! 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105
! T I S R D N S K N T L Y L Q M
|act|atc|TCT|AGA|gac|aac|tct|aag|aat|act|ctc|tac|ttg|cag|atg| 51
! allowed |acn|ath|tcn|cgn|gay|aay|tcn|aar|aay|acn|ttr|tay|ttr|car|atg|
!                 |agy|agr|       |agy|           |ctn|   |ctn|
| XbaI |
ga|gac = BsmAI(16)
ag ct = AluI(23)
c|tcc ag = BpmI(19)
g ctn agc = BlpI(21)
g aan nnn ttc = XmnI(12)
tg ca = HpyCH4V(21)

! FR3 >!
! 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120
! N S L R A E D T A V Y Y C A K
|aac|agC|TTA|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat|tgc|gct|aaa| 96
! allowed |aay|tcn|ttr|cgn|gcn|gar|gay|acn|gcn|gtn|tay|tay|tgy|gcn|aar|
!             |agy|ctn|agr|
| AflII | | PstI |
cc nng g = BsaJI(23)
aga tct = BglII(10)
Rga tcY = BstYI(11)
ac ngt = Bst4CI(51)
ac ngt = HpyCH4III(51)
ac ngt = TaaI(51)
c ayn nnn rtg = MslI(44)
cg ryc g = BsiEI(23)
yg gcc r = EaeI(23)
cg gcc g = EagI(23)
g gcc = HaeIII(25)
gag g = MnlI(31)
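
The "allowed" rows of Table 3 give, for each codon, the degenerate (IUPAC) codon families that encode the same amino acid, for example cgn or agr for arginine. As a hedged sketch of how a candidate codon could be checked against those families, the helper below expands the ambiguity codes. The names `IUPAC_BASES`, `codon_matches`, and `codon_allowed` are invented for this illustration and do not come from the source.

```python
# IUPAC nucleotide ambiguity codes, lower case (standard definitions).
IUPAC_BASES = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at", "k": "gt", "m": "ac",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg", "n": "acgt",
}

def codon_matches(codon, pattern):
    """True if a concrete codon fits one degenerate codon pattern."""
    codon, pattern = codon.lower(), pattern.lower()
    if len(codon) != 3 or len(pattern) != 3:
        raise ValueError("codons must be three bases long")
    return all(base in IUPAC_BASES[p] for base, p in zip(codon, pattern))

def codon_allowed(codon, patterns):
    """True if the codon fits any of the allowed degenerate patterns."""
    return any(codon_matches(codon, p) for p in patterns)

# Codon 94 (R) of the synthetic gene is AGA; Table 3 allows cgn or agr.
print(codon_allowed("AGA", ["cgn", "agr"]))
```

A check of this kind only confirms that a base change is silent at the amino-acid level; it says nothing about whether the change creates or removes a restriction site.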



Table 4: REdaptors, Extenders, and Bridges used for Cleavage and Capture of Human Heavy Chains in FR3.

A: HpyCH4V Probes of actual human HC genes

!HpyCH4V in FR3 of human HC, bases 35-56; only those with TGca site
TGca;10,
RE recognition: tgca of length 4 is expected at 10

1 6-1 agttctccctgcagctgaactc
2 3-11, 3-07, 3-21, 3-72, 3-48 cactgtatctgcaaatgaacag
3 3-09, 3-43, 3-20 ccctgtatctgcaaatgaacag
4 5-51 ccgcctacctgcagtggagcag
5 3-15, 3-30, 3-30.5, 3-30.3, 3-74, 3-23, 3-33 cgctgtatctgcaaatgaacag
6 7-4.1 cggcatatctgcagatctgcag
7 3-73 cggcgtatctgcaaatgaacag
8 5-a ctgcctacctgcagtggagcag
9 3-49 tcgcctatctgcaaatgaacag
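
The statement that tgca "is expected at 10" can be checked mechanically. The following is a minimal sketch, assuming 1-based positions within each 22-base probe and standard IUPAC ambiguity codes; `find_sites` and the `IUPAC_REGEX` table are names made up for this example, not tools named in the source.

```python
import re

# IUPAC ambiguity codes translated to regular-expression character classes.
IUPAC_REGEX = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "[ag]", "y": "[ct]", "s": "[cg]", "w": "[at]", "k": "[gt]",
    "m": "[ac]", "b": "[cgt]", "d": "[agt]", "h": "[act]", "v": "[acg]",
    "n": "[acgt]",
}

def find_sites(seq, site):
    """Return the 1-based start of every occurrence of `site` in `seq`.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    pattern = "".join(IUPAC_REGEX[ch] for ch in site.lower())
    return [m.start() + 1 for m in re.finditer(f"(?={pattern})", seq.lower())]

probes = {
    "6-1": "agttctccctgcagctgaactc",
    "7-4.1": "cggcatatctgcagatctgcag",
}
for name, probe in probes.items():
    print(name, find_sites(probe, "TGca"))
```

A probe that reports a second position, as 7-4.1 does, carries an additional copy of the site within the probed window.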
B: HpyCH4V REdaptors, Extenders, and Bridges

B.1 REdaptors
! Cutting HC lower strand:
! TmKeller for 100 mM NaCl, zero formamide
! Edapters for cleavage

(ON_HCFR36-1)    5'-agttctcccTGCAgctgaactc-3'    68.0    64.5
(ON_HCFR36-1A)   5'-ttctcccTGCAgctgaactc-3'      62.0    62.5
(ON_HCFR36-1B)   5'-ttctcccTGCAgctgaac-3'        56.0    59.9
(ON_HCFR33-15)   5'-cgctgtatcTGCAaatgaacag-3'    64.0    60.8
(ON_HCFR33-15A)  5'-ctgtatcTGCAaatgaacag-3'      56.0    56.3
(ON_HCFR33-15B)  5'-ctgtatcTGCAaatgaac-3'        50.0    53.1
(ON_HCFR33-11)   5'-cactgtatcTGCAaatgaacag-3'    62.0    58.9
(ON_HCFR35-51)   5'-ccgcctaccTGCAgtggagcag-3'    74.0    70.1
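
The first set of eight values listed for these REdaptors (68.0, 62.0, 56.0, 64.0, 56.0, 50.0, 62.0, 74.0) is numerically consistent with the simple Wallace rule, Tm = 2(A+T) + 4(G+C). That reading is an inference from the numbers, not something the table states, and the second set (the Keller values for 100 mM NaCl) is not reproduced here. A minimal sketch, with `wallace_tm` as a name chosen for this illustration:

```python
def wallace_tm(oligo):
    """Estimate Tm in degrees C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    s = "".join(oligo.split()).lower()
    at = s.count("a") + s.count("t")
    gc = s.count("g") + s.count("c")
    if at + gc != len(s):
        raise ValueError("sequence contains characters other than a, c, g, t")
    return 2 * at + 4 * gc

redaptors = {
    "ON_HCFR36-1": "agttctcccTGCAgctgaactc",
    "ON_HCFR36-1A": "ttctcccTGCAgctgaactc",
    "ON_HCFR33-15": "cgctgtatcTGCAaatgaacag",
    "ON_HCFR35-51": "ccgcctaccTGCAgtggagcag",
}
for name, seq in redaptors.items():
    print(name, wallace_tm(seq))
```

The rule ignores salt and neighbor effects, so it is best read as a rough ranking of the oligonucleotides rather than a prediction of the hybridization temperature.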



B.2 Segment of synthetic 3-23 gene into which captured CDR3 is to be cloned

!                   XbaI...                                  HpyCH4V          AflII...
!D323* cgCttcacTaag tcT aga gac aaC tcT aag aaT acT ctC taC Ttg caG atg aac agC TtA agG
! scab designed gene 3-23 gene



B.3 Extender and Bridges

! Extender (bottom strand):
!
(ON_HCHpyEx01) 5'-cAAgTAgAgAgTATTcTTAgAgTTgTcTcTAgAcTTAgTgAAgcg-3'
! ON_HCHpyEx01 is the reverse complement of
! 5'-cgCttcacTaag tcT aga gac aaC tcT aag aaT acT ctc taC Ttg -3'

! Bridges (top strand, 9-base overlap):
!
! 6-1
(ON_HCHpyBr016-1) 5'-cgCttcacTaag tcT aga gac aaC tcT aag-
aaT acT ctc taC Ttg CAgctgaac-3' {3'-term C is blocked}
!
! 3-15 et al. + 3-11
(ON_HCHpyBr023-15) 5'-cgCttcacTaag tcT aga gac aaC tcT aag-
aaT acT ctC taC Ttg CAaatgaac-3' {3'-term C is blocked}
!
! 5-51
(ON_HCHpyBr045-51) 5'-cgCttcacTaag tcT aga gac aaC tcT aag-
aaT acT ctC taC Ttg CAgtggagc-3' {3'-term C is blocked}

! PCR primer (top strand)
!
(ON_HCHpyPCR) 5'-cgCttcacTaag tcT aga gac-3'
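
The relationship stated above, that the extender ON_HCHpyEx01 is the reverse complement of the 45-base top-strand segment, can be verified with a few lines of Python. This is only an illustrative check: `reverse_complement` is a helper written for this sketch, and the single character that is illegible in the extender as printed is read here as T, which is what the complement relationship requires.

```python
# Translation table for complementing DNA while preserving letter case.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string, ignoring whitespace."""
    cleaned = "".join(seq.split())
    return cleaned.translate(_COMPLEMENT)[::-1]

top = "cgCttcacTaag tcT aga gac aaC tcT aag aaT acT ctc taC Ttg"
extender = "cAAgTAgAgAgTATTcTTAgAgTTgTcTcTAgAcTTAgTgAAgcg"

# Case is used for emphasis only in the listing, so compare case-insensitively.
print(reverse_complement(top).lower() == extender.lower())
```

The same helper applies to the other extenders in this table, each of which is described as the reverse complement of a listed top-strand sequence.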

C: BlpI Probes from human HC GLGs

1 1-58, 1-03, 1-08, 1-69, 1-24, 1-45, 1-46, 1-f, 1-e
acatggaGCTGAGCagcctgag
2 1-02
acatggaGCTGAGCaggctgag
3 1-18
acatggagctgaggagcctgag
4 5-51, 5-a
acctgcagtggagcagcctgaa
5 3-15, 3-73, 3-49, 3-72
atctgcaaatgaacagcctgaa
6 3303, 3-33, 3-07, 3-11, 3-30, 3-21, 3-23, 3305, 3-48
atctgcaaatgaacagcctgag
7 3-20, 3-74, 3-09, 3-43
atctgcaaatgaacagtctgag
8 7-4.1
atctgcagatctgcagcctaaa
9 3-66, 3-13, 3-53, 3-d
atcttcaaatgaacagcctgag
10 3-64
atcttcaaatgggcagcctgag
11 4301, 4-28, 4302, 4-04, 4304, 4-31, 4-34, 4-39, 4-59, 4-61, 4-b
ccctgaaGCTGAGCtctgtgac
12
ccctgcagctgaactctgtgac
13 2-70, 2-05
tccttacaatgaccaacatgga
14 2-26
tccttaccatgaccaacatgga

D: BlpI REdaptors, Extenders, and Bridges

D.1 REdaptors
                                                    TmW   TmK
(BlpF3HC1-58) 5'-ac atg gaG CTG AGC agc ctg ag-3'   70    66.4
(BlpF3HC6-1)  5'-cc ctg aag ctg agc tct gtg ac-3'   70    66.4
! BlpF3HC6-1 matches 4-30.1, not 6-1.

D.2 Segment of synthetic 3-23 gene into which captured CDR3 is to be cloned
!
! BlpI
! XbaI...
!D323* cgCttcacTaag TCT AGA gac aaC tcT aag aaT acT ctc taC Ttg caG atg aac
! AflII...
agC TTA AGG

D.3 Extender and Bridges

! Bridges
(BlpF3Br1) 5'-cgCttcacTcag tcT aga gaT aaC AGT aaA aaT acT TtG-
taC Ttg caG Ctg a|GC agc ctg-3'
(BlpF3Br2) 5'-cgCttcacTcag tcT aga gaT aaC AGT aaA aaT acT TtG-
taC Ttg caG Ctg a|gc tct gtg-3'
! | lower strand is cut here

! Extender
(BlpF3Ext) 5'-TcAgcTgcAAgTAcAAAgTATTTTTAcTgTTATcTcTAgAcTgAgTgAAgcg-3'
! BlpF3Ext is the reverse complement of:
! 5'-cgCttcacTcag tcT aga gaT aaC AGT aaA aaT acT TtG taC Ttg caG Ctg a-3'
!
(BlpF3PCR) 5'-cgCttcacTcag tcT aga gaT aaC-3'



E: HpyCH4III Distinct GLG sequences surrounding site, bases 77-98

1 102#1, 118#4, 146#7, 169#9, 1e#10, 311#17, 353#30, 404#37, 4301
ccgtgtattactgtgcgagaga
2 103#2, 307#15, 321#21, 3303#24, 333#26, 348#28, 364#31, 366#32
ctgtgtattactgtgcgagaga
3 108#3
ccgtgtattactgtgcgagagg
4 124#5, 1f#11
ccgtgtattactgtgcaacaga
5 145#6
ccatgtattactgtgcaagata
6 158#8
ccgtgtattactgtgcggcaga
7 205#12
ccacatattactgtgcacacag
8 226#13
ccacatattactgtgcacggat
9 270#14
ccacgtattactgtgcacggat
10 309#16, 343#27
ccttgtattactgtgcaaaaga
11 313#18, 374#35, 61#50
ctgtgtattactgtgcaagaga
12 315#19
ccgtgtattactgtaccacaga
13 320#20
ccttgtatcactgtgcgagaga
14 323#22
ccgtatattactgtgcgaaaga
15 330#23, 3305#25
ctgtgtattactgtgcgaaaga
16 349#29
ccgtgtattactgtactagaga
17 372#33
ccgtgtattactgtgctagaga
18 373#34
ccgtgtattactgtactagaca
19 3d#36
ctgtgtattactgtaagaaaga
20 428#38
ccgtgtattactgtgcgagaaa
21 4302#40, 4304#41
ccgtgtattactgtgccagaga
22 439#44
ctgtgtattactgtgcgagaca
23 551#48
ccatgtattactgtgcgagaca
24 5a#49
ccatgtattactgtgcgaga



F: HpyCH4III REdaptors, Extenders, and Bridges
F.1 REdaptors

! ONs for cleavage of HC (lower) in FR3 (bases 77-97)
! For cleavage with HpyCH4III, Bst4CI, or TaaI
! cleavage is in lower chain before base 88.
!                        77 788 888 888 889 999 999 9
!                        78 901 234 567 890 123 456 7     TmW  TmK
(H43.77.97.1-02#1)    5'-cc gtg tat tAC TGT gcg aga g-3'  64   62.6
(H43.77.97.1-03#2)    5'-ct gtg tat tAC TGT gcg aga g-3'  62   60.6
(H43.77.97.108#3)     5'-cc gtg tat tAC TGT gcg aga g-3'  64   62.6
(H43.77.97.323#22)    5'-cc gta tat tac tgt gcg aaa g-3'  60   58.7
(H43.77.97.330#23)    5'-ct gtg tat tac tgt gcg aaa g-3'  60   58.7
(H43.77.97.439#44)    5'-ct gtg tat tac tgt gcg aga c-3'  62   60.6
(H43.77.97.551#48)    5'-cc atg tat tac tgt gcg aga c-3'  62   60.6
(H43.77.97.5a#49)     5'-cc atg tat tAC TGT gcg aga f-3'  58   58.3

F.2 Extender and Bridges

! XbaI and AflII sites in bridges are bunged
(H43.XABr1) 5'-ggtgtagtga-
|TCT|AGt|gac|aac|tct|aag|aat|act|ctc|tac|ttg|cag|atg|-
|aac|agC|TTt|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat tgt gcg aga-3'
(H43.XABr2) 5'-ggtgtagtga-
|TCT|AGt|gac|aac|tct|aag|aat|act|ctc|tac|ttg|cag|atg|-
|aac|agC|TTt|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat tgt gcg aaa-3'
(H43.XAExt) 5'-ATAgTAgAcT gcAgTgTccT cAgcccTTAA gcTgTTcATc TgcAAgTAgA-
gAgTATTcTT AgAgTTgTcT cTAgATcAcT AcAcc-3'
!H43.XAExt is the reverse complement of
! 5'-ggtgtagtga-
! |TCT|AGA|gac|aac|tct|aag|aat|act|ctc|tac|ttg|cag|atg|-
! |aac|agC|TTA|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat -3'
(H43.XAPCR) 5'-ggtgtagtga |TCT|AGA|gac|aac-3'

! XbaI and AflII sites in bridges are bunged
(H43.ABr1) 5'-ggtgtagtga-
|aac|agC|TTt|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat tgt gcg aga-3'
(H43.ABr2) 5'-ggtgtagtga-
|aac|agC|TTt|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat tgt gcg aaa-3'
(H43.AExt) 5'-ATAgTAgAcTgcAgTgTccTcAgcccTTAAgcTgTTTcAcTAcAcc-3'
! (H43.AExt) is the reverse complement of 5'-ggtgtagtga-
! |aac|agC|TTA|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat -3'
(H43.APCR) 5'-ggtgtagtga |aac|agC|TTA|AGg|gct|g-3'
Table 5D:

Analysis repeated using only 8 best REdaptors

Id  Ntot    0    1    2    3    4    5    6    7   8+
 1   301   78  101   54   32   16    9   10    1    0   281  102#1   ccgtgtattactgtgcgagaga
 2   493   69  155  125   73   37   14   11    3    6   459  103#2   ctgtgtattactgtgcgagaga
 3   189   52   45   38   23   18    5    4    1    3   176  108#3   ccgtgtattactgtgcgagagg
 4   127   29   23   28   24   10    6    5    2    0   114  323#22  ccgtatattactgtgcgaaaga
 5    78   21   25   14   11    1    4    2    0    0    72  330#23  ctgtgtattactgtgcgaaaga
 6    79   15   17   25    8   11    1    2    0    0    76  439#44  ctgtgtattactgtgcgagaca
 7    43   14   15    5    5    3    0    1    0    0    42  551#48  ccatgtattactgtgcgagaca
 8   307   26   63   72   51   38   24   14   13    6   250  5a#49   ccatgtattactgtgcgaga

1 102#1  ccgtgtattactgtgcgagaga ccgtgtattactgtgcgagaga
2 103#2  ctgtgtattactgtgcgagaga .t....................
3 108#3  ccgtgtattactgtgcgagagg .....................g
4 323#22 ccgtatattactgtgcgaaaga ....a.............a...
5 330#23 ctgtgtattactgtgcgaaaga .t................a...
6 439#44 ctgtgtattactgtgcgagaca .t..................c.
7 551#48 ccatgtattactgtgcgagaca ..a.................c.
8 5a#49  ccatgtattactgtgcgagaAA ..a.................AA

Seqs with the expected RE site only 1463 / 1617
Seqs with only an unexpected site 0
Seqs with both expected and unexpected 7
Seqs with no sites 0
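
The dotted alignment in Table 5D shows each REdaptor target against the first one (102#1), with a dot wherever the bases agree. A minimal sketch of that display, and of the mismatch count it implies, is given below. The function names are invented for this illustration, and the comparison is deliberately case-sensitive so that the upper-case AA on the 5a#49 line is shown rather than hidden, as in the table.

```python
def mismatch_pattern(reference, seq):
    """Return `seq` with a dot at every position identical to `reference`.

    The comparison is case-sensitive, so upper-case bases in `seq` are
    always displayed when the reference is written in lower case.
    """
    if len(reference) != len(seq):
        raise ValueError("sequences must have the same length")
    return "".join("." if r == s else s for r, s in zip(reference, seq))

def count_mismatches(reference, seq):
    """Number of positions at which the two sequences differ (case-insensitive)."""
    if len(reference) != len(seq):
        raise ValueError("sequences must have the same length")
    return sum(r.lower() != s.lower() for r, s in zip(reference, seq))

reference = "ccgtgtattactgtgcgagaga"  # 102#1
targets = {
    "323#22": "ccgtatattactgtgcgaaaga",
    "551#48": "ccatgtattactgtgcgagaca",
    "5a#49": "ccatgtattactgtgcgagaAA",
}
for name, seq in targets.items():
    print(name, mismatch_pattern(reference, seq), count_mismatches(reference, seq))
```

The mismatch count is the quantity binned in the columns headed 0 through 8+ of the table, there applied to observed sequences rather than to the REdaptor targets themselves.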
Table 6: Human HC GLG FR1 Sequences

VH Exon - Nucleotide sequence alignment

VH1
1-02 CAG GTG CAG CTG GTG CAG TCT GGG GCT GAG GTG AAG AAG CCT GGG GCC TCA GTG AAG GTC
     TCC TGC AAG GCT TCT GGA TAC ACC TTC ACC
1-03 cag gtC cag ctT gtg cag tct ggg gct gag gtg aag aag cct ggg gcc tca gtg aag gtT
     tcc tgc aag gct tct gga tac acc ttc acT
1-08 cag gtg cag ctg gtg cag tct ggg gct gag gtg aag aag cct ggg gcc tca gtg aag gtc
     tcc tgc aag gct tct gga tac acc ttc acc
1-18 cag gtT cag ctg gtg cag tct ggA gct gag gtg aag aag cct ggg gcc tca gtg aag gtc
     tcc tgc aag gct tct ggT tac acc ttT acc
1-24 cag gtC cag ctg gtA cag tct ggg gct gag gtg aag aag cct ggg gcc tca gtg aag gtc
     tcc tgc aag gTt tcC gga tac acc Ctc acT
1-45 cag Atg cag ctg gtg cag tct ggg gct gag gtg aag aag Act ggg Tcc tca gtg aag gtT
     tcc tgc aag gct tcC gga tac acc ttc acc
1-46 cag gtg cag ctg gtg cag tct ggg gct gag gtg aag aag cct ggg gcc tca gtg aag gtT
     tcc tgc aag gcA tct gga tac acc ttc acc
1-58 caA Atg cag ctg gtg cag tct ggg Cct gag gtg aag aag cct ggg Acc tca gtg aag gtc
     tcc tgc aag gct tct gga tTc acc ttT acT
1-69 cag gtg cag ctg gtg cag tct ggg gct gag gtg aag aag cct ggg Tcc tcG gtg aag gtc
     tcc tgc aag gct tct gga GGc acc ttc aGc
1-e  cag gtg cag ctg gtg cag tct ggg gct gag gtg aag aag cct ggg Tcc tcG gtg aag gtc
     tcc tgc aag gct tct gga GGc acc ttc aGc
1-f  Gag gtC cag ctg gtA cag tct ggg gct gag gtg aag aag cct ggg gcT Aca gtg aaA Atc
     tcc tgc aag gTt tct gga tac acc ttc acc
VH2
2-05 CAG ATC ACC TTG AAG GAG TCT GGT CCT ACG CTG GTG AAA CCC ACA CAG ACC CTC ACG CTG
     ACC TGC ACC TTC TCT GGG TTC TCA CTC AGC
2-26 cag Gtc acc ttg aag gag tct ggt cct GTg ctg gtg aaa ccc aca Gag acc ctc acg ctg
     acc tgc acc Gtc tct ggg ttc tca ctc agc
2-70 cag Gtc acc ttg aag gag tct ggt cct Gcg ctg gtg aaa ccc aca cag acc ctc acA ctg
     acc tgc acc ttc tct ggg ttc tca ctc agc
VH3
3-07 GAG GTG CAG CTG GTG GAG TCT GGG GGA GGC TTG GTC CAG CCT GGG GGG TCC CTG AGA CTC
     TCC TGT GCA GCC TCT GGA TTC ACC TTT AGT
3-09 gaA gtg cag ctg gtg gag tct ggg gga ggc ttg gtA cag cct ggC Agg tcc ctg aga ctc
     tcc tgt gca gcc tct gga ttc acc ttt GAt
3-11 Cag gtg cag ctg gtg gag tct ggg gga ggc ttg gtc Aag cct ggA ggg tcc ctg aga ctc
     tcc tgt gca gcc tct gga ttc acc ttc agt
3-13 gag gtg cag ctg gtg gag tct ggg gga ggc ttg gtA cag cct ggg ggg tcc ctg aga ctc
     tcc tgt gca gcc tct gga ttc acc ttc agt
3-15 gag gtg cag ctg gtg gag tct ggg gga ggc ttg gtA Aag cct ggg ggg tcc ctT aga ctc
     tcc tgt gca gcc tct gga ttc acT ttc agt
3-20 gag gtg cag ctg gtg gag tct ggg gga ggT Gtg gtA cGg cct ggg ggg tcc ctg aga ctc
     tcc tgt gca gcc tct gga ttc acc ttt GAt
3-21 gag gtg cag ctg gtg gag tct ggg gga ggc Ctg gtc Aag cct ggg ggg tcc ctg aga ctc
     tcc tgt gca gcc tct gga ttc acc ttc agt
3-23 gag gtg cag ctg Ttg gag tct ggg gga ggc ttg gtA cag cct ggg ggg tcc ctg aga ctc
     tcc tgt gca gcc tct gga ttc acc ttt agC
3-30 Cag gtg cag ctg gtg gag tct ggg gga ggc Gtg gtc cag cct ggg Agg tcc ctg aga ctc
     tcc tgt gca gcc tct gga ttc acc ttc agt
3-30.3 Cag gtg cag ctg gtg gag tct ggg gga ggc Gtg gtc cag cct ggg Agg tcc ctg aga ctc
     tcc tgt gca gcc tct gga ttc acc ttC agt
3-30.5 Cag gtg cag ctg gtg gag tct ggg gga ggc Gtg gtc cag cct ggg Agg tcc ctg aga ctc
     tcc tgt gca gcc tct gga ttc acc ttC agt
3-33 Cag gtg cag ctg gtg gag tct ggg gga ggc Gtg gtc cag cct ggg Agg tcc ctg aga ctc
     tcc tgt gca gcG tct gga ttc acc ttC agt
3-43 gaA gtg cag ctg gtg gag tct ggg gga gTc Gtg gtA cag cct ggg ggg tcc ctg aga ctc
     tcc tgt gca gcc tct gga ttc acc ttt GAt
3-48 gag gtg cag ctg gtg gag tct ggg gga ggc ttg gtA cag cct ggg ggg tcc ctg aga ctc
     tcc tgt gca gcc tct gga ttc acc ttC agt
3-49 gag gtg cag ctg gtg gag tct ggg gga ggc ttg gtA cag ccA ggg Cgg tcc ctg aga ctc
     tcc tgt Aca gcT tct gga ttc acc ttt Ggt
3-53 gag gtg cag ctg gtg gag Act ggA gga ggc ttg Atc cag cct ggg ggg tcc ctg aga ctc
     tcc tgt gca gcc tct ggG ttc acc GtC agt
3-64 gag gtg cag ctg gtg gag tct ggg gga ggc ttg gtc cag cct ggg ggg tcc ctg aga ctc
     tcc tgt gca gcc tct gga ttc acc ttC agt
3-66 gag gtg cag ctg gtg gag tct ggg gga ggc ttg gtc cag cct ggg ggg tcc ctg aga ctc
     tcc tgt gca gcc tct gga ttc acc GtC agt
3-72 gag gtg cag ctg gtg gag tct ggg gga ggc ttg gtc cag cct ggA ggg tcc ctg aga ctc
     tcc tgt gca gcc tct gga ttc acc ttC agt
3-73 gag gtg cag ctg gtg gag tct ggg gga ggc ttg gtc cag cct ggg ggg tcc ctg aAa ctc
     tcc tgt gca gcc tct ggG ttc acc ttC agt
3-74 gag gtg cag ctg gtg gag tcC ggg gga ggc ttA gtT cag cct ggg ggg tcc ctg aga ctc
     tcc tgt gca gcc tct gga ttc acc ttC agt
3-d  gag gtg cag ctg gtg gag tct Cgg gga gTc ttg gtA cag cct ggg ggg tcc ctg aga ctc
     tcc tgt gca gcc tct gga ttc acc GtC agt
VH4
4-04 CAG GTG CAG CTG CAG GAG TCG GGC CCA GGA CTG GTG AAG CCT TCG GGG ACC CTG TCC CTC
     ACC TGC GCT GTC TCT GGT GGC TCC ATC AGC
4-28 cag gtg cag ctg cag gag tcg ggc cca gga ctg gtg aag cct tcg gAC acc ctg tcc ctc
     acc tgc gct gtc tct ggt TAc tcc atc agc
4-30.1 cag gtg cag ctg cag gag tcg ggc cca gga ctg gtg aag cct tcA CAg acc ctg tcc ctc
     acc tgc Act gtc tct ggt ggc tcc atc agc
4-30.2 cag Ctg cag ctg cag gag tcC ggc Tca gga ctg gtg aag cct tcA CAg acc ctg tcc ctc
     acc tgc gct gtc tct ggt ggc tcc atc agc
4-30.4 cag gtg cag ctg cag gag tcg ggc cca gga ctg gtg aag cct tcA CAg acc ctg tcc ctc
     acc tgc Act gtc tct ggt ggc tcc atc agc
4-31 cag gtg cag ctg cag gag tcg ggc cca gga ctg gtg aag cct tcA CAg acc ctg tcc ctc
     acc tgc Act gtc tct ggt ggc tcc atc agc
4-34 cag gtg cag ctA cag Cag tGg ggc Gca gga ctg Ttg aag cct tcg gAg acc ctg tcc ctc
     acc tgc gct gtc tAt ggt ggG tcc Ttc agT
4-39 cag Ctg cag ctg cag gag tcg ggc cca gga ctg gtg aag cct tcg gAg acc ctg tcc ctc
     acc tgc Act gtc tct ggt ggc tcc atc agc
4-59 cag gtg cag ctg cag gag tcg ggc cca gga ctg gtg aag cct tcg gAg acc ctg tcc ctc
     acc tgc Act gtc tct ggt ggc tcc atc agT
4-61 cag gtg cag ctg cag gag tcg ggc cca gga ctg gtg aag cct tcg gAg acc ctg tcc ctc
     acc tgc Act gtc tct ggt ggc tcc Gtc agc
4-b  cag gtg cag ctg cag gag tcg ggc cca gga ctg gtg aag cct tcg gAg acc ctg tcc ctc
     acc tgc gct gtc tct ggt TAc tcc atc aac
VH5
5-51 GAG GTG CAG CTG GTG CAG TCT GGA GCA GAG GTG AAA AAG CCC GGG GAG TCT CTG AAG ATC
     TCC TGT AAG GGT TCT GGA TAC AGC TTT ACC
5-a  gaA gtg cag ctg gtg cag tct gga gca gag gtg aaa aag ccc ggg gag tct ctg aGg atc
     tcc tgt aag ggt tct gga tac agc ttt acc
VH6
6-1  CAG GTA CAG CTG CAG CAG TCA GGT CCA GGA CTG GTG AAG CCC TCG CAG ACC CTC TCA CTC
     ACC TGT GCC ATC TCC GGG GAC AGT GTC TCT
VH7
7-4.1 CAG GTG CAG CTG GTG CAA TCT GGG TCT GAG TTG AAG AAG CCT GGG GCC TCA GTG AAG GTT
     TCC TGC AAG GCT TCT GGA TAC ACC TTC ACT
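
Table 6 appears to follow a convention in which the first gene of each VH family is printed in upper case as the reference, and every other gene is printed in lower case with upper-case letters only at bases that differ from that reference. That reading is an interpretation of the table rather than a stated rule. Under that assumption, the display can be generated as in the sketch below; `mark_differences` is a name chosen for this example.

```python
def mark_differences(reference, seq):
    """Render `seq` codon by codon in lower case, upper-casing each base
    that differs from the corresponding base of `reference`."""
    ref_codons = reference.lower().split()
    seq_codons = seq.lower().split()
    if len(ref_codons) != len(seq_codons):
        raise ValueError("sequences must contain the same number of codons")
    marked = []
    for ref_codon, codon in zip(ref_codons, seq_codons):
        if len(ref_codon) != len(codon):
            raise ValueError("codons must have the same length")
        marked.append("".join(
            base.upper() if base != ref_base else base
            for ref_base, base in zip(ref_codon, codon)
        ))
    return " ".join(marked)

# First four FR1 codons of 1-02 (family reference) and 1-03.
print(mark_differences("CAG GTG CAG CTG", "cag gtc cag ctt"))
```

Rendering the alignment this way makes it easy to see which bases a family-wide primer or REdaptor would mismatch.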
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Table 7: RERS sites in Human HC GLG FRls where there are at least 20 GLGs cut 
BsgI GTGCAG 71 (cuts 16/14 bases to right)

1: 4 1: 13 2: 13 3: 4 3: 13 4: 13
6: 13 7: 4 7: 13 8: 13 9: 4 9: 13
10: 4 10: 13 15: 4 15: 65 16: 4 16: 65
17: 4 17: 65 18: 4 18: 65 19: 4 19: 65
20: 4 20: 65 21: 4 21: 65 22: 4 22: 65
23: 4 23: 65 24: 4 24: 65 25: 4 25: 65
26: 4 26: 65 27: 4 27: 65 28: 4 28: 65
29: 4 30: 4 30: 65 31: 4 31: 65 32: 4
32: 65 33: 4 33: 65 34: 4 34: 65 35: 4
35: 65 36: 4 36: 65 37: 4 38: 4 39: 4
41: 4 42: 4 43: 4 45: 4 46: 4 47: 4
48: 4 48: 13 49: 4 49: 13 51: 4

There are 39 hits at base# 4

There are 21 hits at base# 65 
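A hit list of this kind can be regenerated from the FR1 sequences by scanning each GLG for the recognition site. The following is a minimal sketch, not the program used to build the table. It assumes that "base#" is the 1-based position of the first base of the recognition site (which agrees with the sequences and hits listed here), and the names `IUPAC`, `find_sites`, `GLGS` and `hit_table` are hypothetical. Only three FR1 sequences from the listing above are included.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "M": "[AC]", "K": "[GT]",
    "S": "[CG]", "W": "[AT]", "N": "[ACGT]",
}


def find_sites(site, seq):
    """Return the 1-based start of every match of `site` in `seq`.

    A zero-width lookahead is used so that overlapping sites are all reported.
    Assumption: a table "base#" is the position of the first base of the site.
    """
    pattern = "".join(IUPAC[base] for base in site.upper())
    return [m.start() + 1 for m in re.finditer(f"(?={pattern})", seq.upper())]


# FR1 sequences copied codon by codon from the listing above.
_CODONS = {
    "5-51": "GAG GTG CAG CTG GTG CAG TCT GGA GCA GAG GTG AAA AAG CCC GGG GAG "
            "TCT CTG AAG ATC TCC TGT AAG GGT TCT GGA TAC AGC TTT ACC",
    "6-1": "CAG GTA CAG CTG CAG CAG TCA GGT CCA GGA CTG GTG AAG CCC TCG CAG "
           "ACC CTC TCA CTC ACC TGT GCC ATC TCC GGG GAC AGT GTC TCT",
    "7-4.1": "CAG GTG CAG CTG GTG CAA TCT GGG TCT GAG TTG AAG AAG CCT GGG GCC "
             "TCA GTG AAG GTT TCC TGC AAG GCT TCT GGA TAC ACC TTC ACT",
}
GLGS = {name: codons.replace(" ", "") for name, codons in _CODONS.items()}


def hit_table(site):
    """Map each GLG name to the list of base positions where `site` occurs."""
    return {name: find_sites(site, seq) for name, seq in GLGS.items()}


if __name__ == "__main__":
    for enzyme, site in [("BsgI", "GTGCAG"), ("BsoFI", "GCNGC"), ("HinfI", "GANTC")]:
        print(enzyme, site, hit_table(site))
```

The sketch scans only the strand as written; sites listed in the reverse orientation (for example ctgcac for BsgI) would be found by passing the reverse-complemented site to `find_sites`.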



ctgcac 9

12: 63 13: 63 14: 63 39: 63 41: 63 42: 63
44: 63 45: 63 46: 63














BbvI GCAGC 65

1: 6 3: 6 6: 6 7: 6 8: 6 9: 6
10: 6 15: 6 15: 67 16: 6 16: 67 17: 6
17: 67 18: 6 18: 67 19: 6 19: 67 20: 6
20: 67 21: 6 21: 67 22: 6 22: 67 23: 6
23: 67 24: 6 24: 67 25: 6 25: 67 26: 6
26: 67 27: 6 27: 67 28: 6 28: 67 29: 6
30: 6 30: 67 31: 6 31: 67 32: 6 32: 67
33: 6 33: 67 34: 6 34: 67 35: 6 35: 67
36: 6 36: 67 37: 6 38: 6 39: 6 40: 6
41: 6 42: 6 43: 6 44: 6 45: 6 46: 6
47: 6 48: 6 49: 6 50: 12 51: 6

There are 43 hits at base# 6   Bolded sites very near sites listed below
There are 21 hits at base# 67

gctgc 13

37: 9 38: 9 39: 9 40: 3 40: 9 41: 9
42: 9 44: 3 44: 9 45: 9 46: 9 47: 9
50: 9

There are 11 hits at base# 9



BsoFI GCngc 78

1: 6 3: 6 6: 6 7: 6 8: 6 9: 6
10: 6 15: 6 15: 67 16: 6 16: 67 17: 6
17: 67 18: 6 18: 67 19: 6 19: 67 20: 6
20: 67 21: 6 21: 67 22: 6 22: 67 23: 6
23: 67 24: 6 24: 67 25: 6 25: 67 26: 6
26: 67 27: 6 27: 67 28: 6 28: 67 29: 6
30: 6 30: 67 31: 6 31: 67 32: 6 32: 67
33: 6 33: 67 34: 6 34: 67 35: 6 35: 67
36: 6 36: 67 37: 6 37: 9 38: 6 38: 9
39: 6 39: 9 40: 3 40: 6 40: 9 41: 6
41: 9 42: 6 42: 9 43: 6 44: 3 44: 6
44: 9 45: 6 45: 9 46: 6 46: 9 47: 6
47: 9 48: 6 49: 6 50: 9 50: 12 51: 6

There are 43 hits at base# 6   These often occur together.
There are 11 hits at base# 9
There are 2 hits at base# 3

There are 21 hits at base# 67 
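The "There are N hits at base# B" summary lines follow directly from the "GLG: base" pairs. As a small illustration (a sketch only; `parse_hits` and `GCTGC` are hypothetical names, and the data are the gctgc pairs printed earlier in this table), the pairs can be parsed and tallied as follows.

```python
import re
from collections import Counter


def parse_hits(text):
    """Parse 'GLG: base' pairs into a list of (glg_number, base_position) tuples."""
    return [(int(glg), int(base)) for glg, base in re.findall(r"(\d+)\s*:\s*(\d+)", text)]


# The gctgc hit list as printed in this table.
GCTGC = """
37: 9 38: 9 39: 9 40: 3 40: 9 41: 9
42: 9 44: 3 44: 9 45: 9 46: 9 47: 9
50: 9
"""

hits = parse_hits(GCTGC)
per_base = Counter(base for _glg, base in hits)

if __name__ == "__main__":
    print(f"total hits: {len(hits)}")
    for base, count in per_base.most_common():
        print(f"There are {count} hits at base# {base}")
```

Comparing such tallies with the printed totals is also a convenient way to check a transcribed list for dropped or duplicated entries.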



TseI Gcwgc 78

1: 6 3: 6 6: 6 7: 6 8: 6 9: 6
10: 6 15: 6 15: 67 16: 6 16: 67 17: 6
17: 67 18: 6 18: 67 19: 6 19: 67 20: 6
20: 67 21: 6 21: 67 22: 6 22: 67 23: 6
23: 67 24: 6 24: 67 25: 6 25: 67 26: 6
26: 67 27: 6 27: 67 28: 6 28: 67 29: 6
30: 6 30: 67 31: 6 31: 67 32: 6 32: 67
33: 6 33: 67 34: 6 34: 67 35: 6 35: 67
36: 6 36: 67 37: 6 37: 9 38: 6 38: 9
39: 6 39: 9 40: 3 40: 6 40: 9 41: 6
41: 9 42: 6 42: 9 43: 6 44: 3 44: 6
44: 9 45: 6 45: 9 46: 6 46: 9 47: 6
47: 9 48: 6 49: 6 50: 9 50: 12 51: 6

There are 43 hits at base# 6   Often together.
There are 11 hits at base# 9
There are 2 hits at base# 3
There are 1 hits at base# 12
There are 21 hits at base# 67



MspA1I CMGckg 48

1: 7 3: 7 4: 7 5: 7 6: 7 7: 7
8: 7 9: 7 10: 7 11: 7 15: 7 16: 7
17: 7 18: 7 19: 7 20: 7 21: 7 22: 7
23: 7 24: 7 25: 7 26: 7 27: 7 28: 7
29: 7 30: 7 31: 7 32: 7 33: 7 34: 7
35: 7 36: 7 37: 7 38: 7 39: 7 40: 1
40: 7 41: 7 42: 7 44: 1 44: 7 45: 7
46: 7 47: 7 48: 7 49: 7 50: 7 51: 7

There are 46 hits at base# 7

PvuII CAGctg 48

1: 7 3: 7 4: 7 5: 7 6: 7 7: 7
8: 7 9: 7 10: 7 11: 7 15: 7 16: 7
17: 7 18: 7 19: 7 20: 7 21: 7 22: 7
23: 7 24: 7 25: 7 26: 7 27: 7 28: 7
29: 7 30: 7 31: 7 32: 7 33: 7 34: 7
35: 7 36: 7 37: 7 38: 7 39: 7 40: 1
40: 7 41: 7 42: 7 44: 1 44: 7 45: 7
46: 7 47: 7 48: 7 49: 7 50: 7 51: 7

There are 46 hits at base# 7
There are 2 hits at base# 1

AluI AGct 54

1: 8 2: 8 3: 8 4: 8 4: 24 5: 8
6: 8 7: 8 8: 8 9: 8 10: 8 11: 8
15: 8 16: 8 17: 8 18: 8 19: 8 20: 8
21: 8 22: 8 23: 8 24: 8 25: 8 26: 8
27: 8 28: 8 29: 8 29: 69 30: 8 31: 8
32: 8 33: 8 34: 8 35: 8 36: 8 37: 8
38: 8 39: 8 40: 2 40: 8 41: 8 42: 8
43: 8 44: 2 44: 8 45: 8 46: 8 47: 8
48: 8 48: 82 49: 8 49: 82 50: 8 51: 8

There are 48 hits at base# 8
There are 2 hits at base# 2

DdeI Ctnag 48

1: 26 1: 48 2: 26 2: 48 3: 26 3: 48
4: 26 4: 48 5: 26 5: 48 6: 26 6: 48
7: 26 7: 48 8: 26 8: 48 9: 26 10: 26
11: 26 12: 85 13: 85 14: 85 15: 52 16: 52
17: 52 18: 52 19: 52 20: 52 21: 52 22: 52
23: 52 24: 52 25: 52 26: 52 27: 52 28: 52
29: 52 30: 52 31: 52 32: 52 33: 52 35: 30
35: 52 36: 52 40: 24 49: 52 51: 26 51: 48

There are 22 hits at base# 52   52 and 48 never together.
There are 9 hits at base# 48
There are 12 hits at base# 26   26 and 24 never together.

HphI tcacc 42

1: 86 3: 86 6: 86 7: 86 8: 80 11: 86
12: 5 13: 5 14: 5 15: 80 16: 80 17: 80
18: 80 20: 80 21: 80 22: 80 23: 80 24: 80
25: 80 26: 80 27: 80 28: 80 29: 80 30: 80
31: 80 32: 80 33: 80 34: 80 35: 80 36: 80
37: 59 38: 59 39: 59 40: 59 41: 59 42: 59
43: 59 44: 59 45: 59 46: 59 47: 59 50: 59

There are 22 hits at base# 80   80 and 86 never together
There are 5 hits at base# 86
There are 12 hits at base# 59



BssKI Nccngg 50

1: 39 2: 39 3: 39 4: 39 5: 39 7: 39
8: 39 9: 39 10: 39 11: 39 15: 39 16: 39
17: 39 18: 39 19: 39 20: 39 21: 29 21: 39
22: 39 23: 39 24: 39 25: 39 26: 39 27: 39
28: 39 29: 39 30: 39 31: 39 32: 39 33: 39
34: 39 35: 19 35: 39 36: 39 37: 24 38: 24
39: 24 41: 24 42: 24 44: 24 45: 24 46: 24
47: 24 48: 39 48: 40 49: 39 49: 40 50: 24
50: 73 51: 39

There are 35 hits at base# 39   39 and 40 together twice.
There are 2 hits at base# 40



BsaJI Ccnngg 47

1: 40 2: 40 3: 40 4: 40 5: 40 7: 40
8: 40 9: 40 9: 47 10: 40 10: 47 11: 40
15: 40 18: 40 19: 40 20: 40 21: 40 22: 40
23: 40 24: 40 25: 40 26: 40 27: 40 28: 40
29: 40 30: 40 31: 40 32: 40 34: 40 35: 20
35: 40 36: 40 37: 24 38: 24 39: 24 41: 24
42: 24 44: 24 45: 24 46: 24 47: 24 48: 40
48: 41 49: 40 49: 41 50: 74 51: 40

There are 32 hits at base# 40   40 and 41 together twice
There are 2 hits at base# 41
There are 9 hits at base# 24
There are 2 hits at base# 47



BstNI CCwgg 44
PspGI ccwgg
ScrFI ($M.HpaII) CCwgg

1: 40 2: 40 3: 40 4: 40 5: 40 7: 40
8: 40 9: 40 10: 40 11: 40 15: 40 16: 40
17: 40 18: 40 19: 40 20: 40 21: 30 21: 40
22: 40 23: 40 24: 40 25: 40 26: 40 27: 40
28: 40 29: 40 30: 40 31: 40 32: 40 33: 40
34: 40 35: 40 36: 40 37: 25 38: 25 39: 25
41: 25 42: 25 44: 25 45: 25 46: 25 47: 25
50: 25 51: 40

There are 33 hits at base# 40



ScrFI CCngg 50

1: 40 2: 40 3: 40 4: 40 5: 40 7: 40
8: 40 9: 40 10: 40 11: 40 15: 40 16: 40
17: 40 18: 40 19: 40 20: 40 21: 30 21: 40
22: 40 23: 40 24: 40 25: 40 26: 40 27: 40
28: 40 29: 40 30: 40 31: 40 32: 40 33: 40
34: 40 35: 20 35: 40 36: 40 37: 25 38: 25
39: 25 41: 25 42: 25 44: 25 45: 25 46: 25
47: 25 48: 40 48: 41 49: 40 49: 41 50: 25
50: 74 51: 40

There are 35 hits at base# 40
There are 2 hits at base# 41



EcoO109I RGgnccy 34

1: 43 2: 43 3: 43 4: 43 5: 43 6: 43
7: 43 8: 43 9: 43 10: 43 15: 46 16: 46
17: 46 18: 46 19: 46 20: 46 21: 46 22: 46
23: 46 24: 46 25: 46 26: 46 27: 46 28: 46
30: 46 31: 46 32: 46 33: 46 34: 46 35: 46
36: 46 37: 46 43: 79 51: 43

There are 22 hits at base# 46   46 and 43 never together
There are 11 hits at base# 43












NlaIV GGNncc 71

1: 43 2: 43 3: 43 4: 43 5: 43 6: 43
7: 43 8: 43 9: 43 9: 79 10: 43 10: 79
15: 46 15: 47 16: 47 17: 46 17: 47 18: 46
18: 47 19: 46 19: 47 20: 46 20: 47 21: 46
21: 47 22: 46 22: 47 23: 47 24: 47 25: 47
26: 47 27: 46 27: 47 28: 46 28: 47 29: 47
30: 46 30: 47 31: 46 31: 47 32: 46 32: 47
33: 46 33: 47 34: 46 34: 47 35: 46 35: 47
36: 46 36: 47 37: 21 37: 46 37: 47 37: 79
38: 21 39: 21 39: 79 40: 79 41: 21 41: 79
42: 21 42: 79 43: 79 44: 21 44: 79 45: 21
45: 79 46: 21 46: 79 47: 21 51: 43

There are 23 hits at base# 47   46 & 47 often together
There are 17 hits at base# 46
There are 11 hits at base# 43



Sau96I Ggncc 70

1: 44 2: 3 2: 44 3: 44 4: 44 5: 3
5: 44 6: 44 7: 44 8: 22 8: 44 9: 44
10: 44 11: 3 12: 22 13: 22 14: 22 15: 33
15: 47 16: 47 17: 47 18: 47 19: 47 20: 47
21: 47 22: 47 23: 33 23: 47 24: 33 24: 47
25: 33 25: 47 26: 33 26: 47 27: 47 28: 47
29: 47 30: 47 31: 33 31: 47 32: 33 32: 47
33: 33 33: 47 34: 33 34: 47 35: 47 36: 47
37: 21 37: 22 37: 47 38: 21 38: 22 39: 21
39: 22 41: 21 41: 22 42: 21 42: 22 43: 80
44: 21 44: 22 45: 21 45: 22 46: 21 46: 22
47: 21 47: 22 50: 22 51: 44

There are 23 hits at base# 47   These do not occur together.

There are 11 hits at base# 44 

There are 14 hits at base# 22 These do occur together. 

There are 9 hits at base# 21 

BsmAI GTCTCNnnnn 22

1: 58 3: 58 4: 58 5: 58 8: 58 9: 58
10: 58 13: 70 36: 18 37: 70 38: 70 39: 70
40: 70 41: 70 42: 70 44: 70 45: 70 46: 70
47: 70 48: 48 49: 48 50: 85

There are 11 hits at base# 70



Nnnnnngagac 27

13: 40 15: 48 16: 48 17: 48 18: 48 20: 48
21: 48 22: 48 23: 48 24: 48 25: 48 26: 48
27: 48 28: 48 29: 48 30: 10 30: 48 31: 48
32: 48 33: 48 35: 48 36: 48 43: 40 44: 40
45: 40 46: 40 47: 40















There are 20 hits at base# 48 
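Several enzymes in this table are listed twice, once for the recognition site as written and once for its reverse complement (GTGCAG and ctgcac, GCAGC and gctgc, GTCTC and gagac), while palindromic sites such as CAGctg need only one listing. The relationship can be sketched as below; this is an illustration only, and `reverse_complement` and `is_palindromic` are hypothetical helper names. The complement table covers the IUPAC ambiguity codes used in this table.

```python
# Complement table for plain bases and the IUPAC ambiguity codes used here
# (R<->Y, M<->K; S, W and N are their own complements). Case is preserved.
COMPLEMENT = str.maketrans(
    "ACGTRYMKSWNacgtrymkswn",
    "TGCAYRKMSWNtgcayrkmswn",
)


def reverse_complement(site):
    """Return the reverse complement of a recognition site."""
    return site.translate(COMPLEMENT)[::-1]


def is_palindromic(site):
    """True when the site reads the same on both strands (one listing suffices)."""
    return reverse_complement(site.upper()) == site.upper()


if __name__ == "__main__":
    for site in ["GTGCAG", "GCAGC", "GTCTC", "GGGAC", "GAGTC", "CAGCTG", "RGGNCCY"]:
        print(site, reverse_complement(site), is_palindromic(site))
```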



AvaII Ggwcc 44
Sau96I ($M.HaeIII) Ggwcc 44

2: 3 5: 3 6: 44 8: 44 9: 44 10: 44
11: 3 12: 22 13: 22 14: 22 15: 33 15: 47
16: 47 17: 47 18: 47 19: 47 20: 47 21: 47
22: 47 23: 33 23: 47 24: 33 24: 47 25: 33
25: 47 26: 33 26: 47 27: 47 28: 47 29: 47
30: 47 31: 33 31: 47 32: 33 32: 47 33: 33
33: 47 34: 33 34: 47 35: 47 36: 47 37: 47
43: 80 50: 22

There are 23 hits at base# 47   44 & 47 never together
There are 4 hits at base# 44



PpuMI RGgwccy 27 

6: 43 8: 43 9: 43 10: 43 15: 46 16: 46
17: 46 18: 46 19: 46 20: 46 21: 46 22: 46
23: 46 24: 46 25: 46 26: 46 27: 46 28: 46
30: 46 31: 46 32: 46 33: 46 34: 46 35: 46
36: 46 37: 46 43: 79

There are 22 hits at base# 46 43 and 46 never occur together. 
There are 4 hits at base# 43 
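Remarks such as "43 and 46 never occur together" or "40 and 41 together twice" state whether two site positions fall in the same GLG. A minimal sketch of that check follows; `bases_by_glg` and `glgs_with_both` are hypothetical names. The PpuMI pairs are those listed above, and the four BsaJI pairs are the GLG 48 and 49 entries from the BsaJI list earlier in this table.

```python
from collections import defaultdict

# PpuMI (RGgwccy) hits as (GLG number, base position), from the list above.
PPUMI = (
    [(6, 43), (8, 43), (9, 43), (10, 43)]
    + [(glg, 46) for glg in list(range(15, 29)) + list(range(30, 38))]
    + [(43, 79)]
)

# BsaJI hits for GLGs 48 and 49 only, from the BsaJI list in this table.
BSAJI_48_49 = [(48, 40), (48, 41), (49, 40), (49, 41)]


def bases_by_glg(hits):
    """Group hit positions by GLG number."""
    grouped = defaultdict(set)
    for glg, base in hits:
        grouped[glg].add(base)
    return grouped


def glgs_with_both(hits, base_a, base_b):
    """Return the sorted GLG numbers that carry a site at both positions."""
    grouped = bases_by_glg(hits)
    return sorted(glg for glg, bases in grouped.items() if {base_a, base_b} <= bases)


if __name__ == "__main__":
    print("PpuMI 43 and 46 together in GLGs:", glgs_with_both(PPUMI, 43, 46))
    print("BsaJI 40 and 41 together in GLGs:", glgs_with_both(BSAJI_48_49, 40, 41))
```

An empty result corresponds to "never together"; a list of two GLG numbers corresponds to "together twice".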

BsmFI GGGAC 3

8: 43 37: 46 50: 77

-"- gtccc 33

15: 48 16: 48 17: 48 1: 0 1: 0 20: 48
21: 48 22: 48 23: 48 24: 48 25: 48 26: 48
27: 48 28: 48 29: 48 30: 48 31: 48 32: 48
33: 48 34: 48 35: 48 36: 48 37: 54 38: 54
39: 54 40: 54 41: 54 42: 54 43: 54 44: 54
45: 54 46: 54 47: 54

There are 20 hits at base# 48
There are 11 hits at base# 54



HinfI Gantc 80

8: 77 12: 16 13: 16 14: 16 15: 16 15: 56
15: 77 16: 16 16: 56 16: 77 17: 16 17: 56
17: 77 18: 16 18: 56 18: 77 19: 16 19: 56
19: 77 20: 16 20: 56 20: 77 21: 16 21: 56
21: 77 22: 16 22: 56 22: 77 23: 16 23: 56
23: 77 24: 16 24: 56 24: 77 25: 16 25: 56
25: 77 26: 16 26: 56 26: 77 27: 16 27: 26
27: 56 27: 77 28: 16 28: 56 28: 77 29: 16
29: 56 29: 77 30: 56 31: 16 31: 56 31: 77
32: 16 32: 56 32: 77 33: 16 33: 56 33: 77
34: 16 35: 16 35: 56 35: 77 36: 16 36: 26
36: 56 36: 77 37: 16 38: 16 39: 16 40: 16
41: 16 42: 16 44: 16 45: 16 46: 16 47: 16
48: 46 49: 46

There are 34 hits at base# 16



TfiI Gawtc 21

8: 77 15: 77 16: 77 17: 77 18: 77 19: 77
20: 77 21: 77 22: 77 23: 77 24: 77 25: 77
26: 77 27: 77 28: 77 29: 77 31: 77 32: 77
33: 77 35: 77 36: 77

There are 21 hits at base# 77

MlyI GAGTC 38

12: 16 13: 16 14: 16 15: 16 16: 16 17: 16
18: 16 19: 16 20: 16 21: 16 22: 16 23: 16
24: 16 25: 16 26: 16 27: 16 27: 26 28: 16
29: 16 31: 16 32: 16 33: 16 34: 16 35: 16
36: 16 36: 26 37: 16 38: 16 39: 16 40: 16
41: 16 42: 16 44: 16 45: 16 46: 16 47: 16
48: 46 49: 46

There are 34 hits at base# 16



-"- GACTC 21

15: 56 16: 56 17: 56 18: 56 19: 56 20: 56
21: 56 22: 56 23: 56 24: 56 25: 56 26: 56
27: 56 28: 56 29: 56 30: 56 31: 56 32: 56
33: 56 35: 56 36: 56

There are 21 hits at base# 56

PleI gagtc 38

12: 16 13: 16 14: 16 15: 16 16: 16 17: 16
18: 16 19: 16 20: 16 21: 16 22: 16 23: 16
24: 16 25: 16 26: 16 27: 16 27: 26 28: 16
29: 16 31: 16 32: 16 33: 16 34: 16 35: 16
36: 16 36: 26 37: 16 38: 16 39: 16 40: 16
41: 16 42: 16 44: 16 45: 16 46: 16 47: 16
48: 46 49: 46

There are 34 hits at base# 16

-"- gactc 21

15: 56 16: 56 17: 56 18: 56 19: 56 20: 56
21: 56 22: 56 23: 56 24: 56 25: 56 26: 56
27: 56 28: 56 29: 56 30: 56 31: 56 32: 56
33: 56 35: 56 36: 56

There are 21 hits at base# 56

AlwNI CAGNNNctg 26

15: 68 16: 68 17: 68 18: 68 19: 68 20: 68
21: 68 22: 68 23: 68 24: 68 25: 68 26: 68
27: 68 28: 68 29: 68 30: 68 31: 68 32: 68
33: 68 34: 68 35: 68 36: 68 39: 46 40: 46
41: 46 42: 46

There are 22 hits at base# 68

Table 8: Kappa FR1 GLGs





1 1 






A 


c 
0 


O 


7 


Q 
O 


Q 


10 


11 


12 






GAC 


A1G 


GAG 


ATP 


APP 
Abb 


PAP 
bnb 


TPT 
l b l 


PPA 
bbn 


TCC 


TCC 


CTG 


TCT 






! 1 J 


1*1 


ID 


X 0 




10 


1 Q 

i 2? 


90 


21 


22 


23 






c 
O 


GCA 


111 


OTA 

Gl A 


OO A 

oGA 


pap 
GAL. 


APA 


PTP 
bib 


APP 

Abb 


ATP 
n i v» 


APT 
n^» x 


TGC 


j 


012 




GAC 


A f O 

ATC 


O AO 

GAG 


7ATO 

AiG 


AGG 


pap 
bAb 


TPT 
I b 1 


PPA 
bbn 


TPP 

X bb 


TPP 


PTP 


TPT 






GCA 


TCT 


Of A 

GTA 


OO A 

GGA 


O A O 

GAG 


AGA 


Off 

GIG 


app 
Abb 


ATP 
Aib 


APT 
Abi 


TPP 
x bb 




02 




GAC 


ATC 


O AO 

CAG 


A f O 

AIG 


A OO 

AGG 


o A O 

GAG 


f Of 
1 CI 


PPA 
GGA 


TPP 
ibb 


TPP 

i bb 


PTP 
bib 


TPT 

X b X 






GCA 


TCT 


Of A 

GTA 


O O A 

GGA 


O A O 

GAG 


apa 
AGA 


PTP 

bib 


& PP 
Abb 


ATP 

Aib 


APT 
nb X 


TPP 


1 


018 


10 


GAC 


TV f O 

ATC 


O AO 

GAG 


A TP 

Alb 


AGG 


par 

bnb 


TPT 
ibi 


PPA 
bbA 


TPP 

X bb 


TPP 

X bb 


PTP 


TPT 






oo a 
GCA 


lCl 


Of A 

G1A 


OO A 


OA o 
bAb 


APA 
nun 


PTP 
bib 


APP 

Abb 


ATP 
niu 


APT 
nw x 


TPP 


1 


08 




O AO 

GAC 


ATP 
A1C 


O AO 

GAG 


a tp 

HI b 


APP 


PAP 
bnb 


TPT 
ibi 


PPA 

bbn 


TPP 

X Uv 


TPP 


CTG 


TCT 






OO TV 

GCA 


1C I 


Of A 

G1A 


PPA 
bun 


PAP 
bAb 


APA 
nun 


PTP 
bib 


APP 
nbb 


ATP 
n x v* 


APT 
nv^ x 


TGC 


i 


A20 




o a o 

GAC 


A f O 

ATC 


O A O 

GAG 


TV fO 

A1G 


AGG 


par* 
bAb 


TPT 
lbl 


PPA 
bbA 


TPP 

i bb 


TPP 

i bb 


PTP 


TPT 




lo 


OOA 

GCA 


mrim 
1L1 


ptzs. 
Gl A 


GGA 


bAb 


APA 
nbn 


PTP 
bib 


APP 

Abb 


ATP 

Aib 


APT 
nw x 


TPP 


J 


A30 




A A O 

AAC 


A1C 


GAG 


A f o 

Alb 


APP 

Abb 


PAP 
bnb 


TPT 
ibi 


PPA 

bbn 


TPT 

ibi 


PPP 


ATP 


TCT 

X X 






ppa 

bun 


TOT 


PT A 
b 1 A 


PPA 
bun 


PAP 
bnb 


APA 


PTP 


ACC 


ATC 


ACT 


TGT 


I 


L14 




pap 

bAb 


ATP 


PAP 
bAb 


ATP 


APP 


PAP 


TPT 


CCA 


TCC 


TCA 


CTG 


TCT 






PPA 
bbA 


TPT 


PTA 


PPA 


PAP 


APA 


GTC 


ACC 


ATC 


ACT 


TGT 


| 


LI 


90 


pap 


ATP 


PAP 


ATP 


APP 


CAG 


TCT 


CCA 


TCC 


TCA 


CTG 


TCT 






ppa 

bbn 


TPT 


PTA 


PPA 


PAP 


APA 
nun 


GTC 


ACC 


ATC 


ACT 


TGT 


i 


L15 




ppp 


ATP 
Ai b 


PAP 
bnb 


TTP 


APP 


PAP 


TCT 


CCA 


TCC 


TCC 


CTG 


TCT 






rrj\ 

bbA 


TPT 


PTA 
win 


PPA 


PAP 


APA 


PTP 


ACC 


ATC 


ACT 


TGC 


j 


L4 




ppp 


ATP 
nx b 


PAP 
bnvj 


TTP 


APP 


CAG 


TCT 


CCA 


TCC 


TCC 


CTG 


TCT 




25 


GCA 


TCT 


GTA 


GGA 


GAC 


AGA 


GTC 


7\ OO 

ACC 


ATC 


A Of 

ACT 


f OO 

TGC 


i 


T 1 Q 
lil O 




GAC ATC CAG ATG ACC CAG TCT CCA TCT TCC GTG TCT
GCA TCT GTA GGA GAC AGA GTC ACC ATC ACT TGT ! L5
GAC ATC CAG ATG ACC CAG TCT CCA TCT TCT GTG TCT
GCA TCT GTA GGA GAC AGA GTC ACC ATC ACT TGT ! L19
GAC ATC CAG TTG ACC CAG TCT CCA TCC TTC CTG TCT
GCA TCT GTA GGA GAC AGA GTC ACC ATC ACT TGC ! L8
GCC ATC CGG ATG ACC CAG TCT CCA TTC TCC CTG TCT
GCA TCT GTA GGA GAC AGA GTC ACC ATC ACT TGC ! L23
GCC ATC CGG ATG ACC CAG TCT CCA TCC TCA TTC TCT
GCA TCT ACA GGA GAC AGA GTC ACC ATC ACT TGT ! L9
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ptp fiTO van stc nrr pap tpt 

GCA TCT ACA GGA GAC AGA GTC
GCC ATC CAG ATG ACC CAG TCT
GCA TCT GTA GGA GAC AGA GTC
GAC ATC CAG ATG ACC CAG TCT
GCA TCT GTA GGA GAC AGA GTC
GAT ATT GTG ATG ACC CAG ACT
GTC ACC CCT GGA GAG CCG GCC
GAT ATT GTG ATG ACC CAG ACT
GTC ACC CCT GGA GAG CCG GCC
GAT GTT GTG ATG ACT CAG TCT
GTC ACC CTT GGA CAG CCG GCC
GAT GTT GTG ATG ACT CAG TCT
GTC ACC CTT GGA CAG CCG GCC
GAT ATT GTG ATG ACC CAG ACT
GTC ACC CCT GGA CAG CCG GCC
GAT ATT GTG ATG ACC CAG ACT
GTC ACC CCT GGA CAG CCG GCC
GAT ATT GTG ATG ACT CAG TCT
GTC ACC CCT GGA GAG CCG GCC
GAT ATT GTG ATG ACT CAG TCT
GTC ACC CCT GGA GAG CCG GCC
GAT ATT GTG ATG ACC CAG ACT
GTC ACC CTT GGA CAG CCG GCC
GAA ATT GTG TTG ACG CAG TCT
TTG TCT CCA GGG GAA AGA GCC
GAA ATT GTG TTG ACG CAG TCT
TTG TCT CCA GGG GAA AGA GCC
GAA ATA GTG ATG ACG CAG TCT
GTG TCT CCA GGG GAA AGA GCC
GAA ATA GTG ATG ACG CAG TCT
GTG TCT CCA GGG GAA AGA GCC
GAA ATT GTG TTG ACA CAG TCT
TTG TCT CCA GGG GAA AGA GCC
GAA ATT GTG TTG ACA CAG TCT
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pp ft 


TPP 


TTA 


PTP 


TPT 




a pp 


Al C 


AC1 


TPT 
1 CI 


j 


T O A 


pp A 
CCA 


ICC 


ICC 


PTP. 
Liu 


TPT 
1C i 




ACC 


Al C 


ROT 
AC! 


TPP 
1 CC 


t 


Til 

LI JL 


ppnp 
CC1 


ICC 


APp 
ACC 


PTP 
C1C 


TPT 
1 CI 




app 


Al C 


APT 
ACi 


TPP 
1 CC 


t 


T 1 O 




CI C 


Tpp 
ICC 


PTP 
CI c 


PPP 

CCC 




TPP 

ICC 


A.TP 

Al C 


ICC 


TPP 

1 cc 


1 


Pil 1 


pp a 
CCA 


CI c 


TPP 

ICC 


PTP 

C1C 


PPP 
CCC 




ICC 


A TP 

Al C 


rppp 
ICC 


T»PP 
1 CC 


1 


p*l 


pp 7\ 

CCA 


pmp 

C1C 


rnpp 

TCC 


CTC 


PPP 

CCC 




rppp 

ICC 


A1C 


rnpr« 

ICC 


TPP 

ICC 


1 


AIT 

Al / 


CCA 


CI c 


TPP 

ICC 


PTP 
Lib 


PPP 
CCC 




tpp 

ICC 


AiC 


TPP 

ICC 


TPP 

iCC 


t 


A 1 

Al 


pp 7\ 
CCA 


C1C 


1C1 


PTP 

C1C 


rppp 

ICC 




ICC 


ATP 

AIC 


ICC 


ICC 


f 


A 1 Q 

Alo 




ptp 
C1C 


TPT 
1C1 


PTP 

CiC 


mpp 

ICC 




ICC 


7\ mp 
AIC 


T»PP 

ICC 


1 CC 


1 


A *5 
A<£ 


pp 7\ 
CCA 


PTP 

C1C 


1 CC 


PTP 

C lC 


PPP 

CCC 




tpp 
1 cc 


a tp 
AIC 


1 cc 


TPP 

1 CC 


1 


Al 


ppzv 

CCA 


PTP 
CI C 


TPP 

1 cc 


PTP 

C 1 c 


PPP 

CCC 




ICC 


Al C 


1 CC 


TPP 
ICC 


1 


AO 


ppa 

CC/i 


PTP 
C 1 C 


TPP 
1 CC 


tp a 

1 CA 


PPT 

CC 1 




TPP 


ZXTP 
Al C 


T»pp 
1 CC 


TP.P 
1 cc 


1 


AZ O 


ppa 


P.PP 

OCC 


app 


PTP 
C lb 


TPT 
1 C 1 




ACC CTC TCC TGC ! A27
CCA GCC ACC CTG TCT
ACC CTC TCC TGC ! A11
CCA GCC ACC CTG TCT
ACC CTC TCC TGC ! L2
CCA GCC ACC CTG TCT
ACC CTC TCC TGC ! L16
CCA GCC ACC CTG TCT
ACC CTC TCC TGC ! L6
CCA GCC ACC CTG TCT

TTG TCT CCA GGG GAA AGA GCC ACC CTC TCC TGC ! L20
GAA ATT GTA ATG ACA CAG TCT CCA GCC ACC CTG TCT
TTG TCT CCA GGG GAA AGA GCC ACC CTC TCC TGC ! L25
GAC ATC GTG ATG ACC CAG TCT CCA GAC TCC CTG GCT
GTG TCT CTG GGC GAG AGG GCC ACC ATC AAC TGC ! B3
GAA ACG ACA CTC ACG CAG TCT CCA GCA TTC ATG TCA
GCG ACT CCA GGA GAC AAA GTC AAC ATC TCC TGC ! B2
GAA ATT GTG CTG ACT CAG TCT CCA GAC TTT CAG TCT
GTG ACT CCA AAG GAG AAA GTC ACC ATC ACC TGC ! A26
GAA ATT GTG CTG ACT CAG TCT CCA GAC TTT CAG TCT
GTG ACT CCA AAG GAG AAA GTC ACC ATC ACC TGC ! A10
GAT GTT GTG ATG ACA CAG TCT CCA GCT TTC CTC TCT
GTG ACT CCA GGG GAG AAA GTC ACC ATC ACC TGC ! A14



WO 02/083872 



118 



PCT/US02/12405 



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MpyCH 
4V 


1636 


1736 


$ 

OO 

T— 


















i 














Mnll 




1726 






1956 


2056 


2156 


2256 


2356 


2456 


2556 


2656 


2729 2756 




2860 


2960 


3060 


3160 


BsmAI 


1618 1647 


1718 1747 


1818 1847 




« 




2118 


2218 


• 




2518 


2618 






2818 2839 


2918 2939 


3018 3039 


3118 3139 




1615 


1715 


1815 




















» 








■ 




PflFI 


16121649 


1712 1749 


1812 1849 








2112 


2212 






2512 


2612 






2812 


2912 


3012 


3112 


A 
1 

1 

~ V 
-g A 


1608 1623 


1703 1723 


1803 
















i 










• 






MslI 


1603 


1703 


1803 


































L24 1601-1669 


Lll 1701-1769 


L12 1801-1869 


VKII 


o 

CN 

o 

CN 

o 


O! 2001-2069 


CN 
NO 

CN 

o 

CN 
< 


A1 2201-2269 


A18 2301-2369 


A2 2401-2469 


A19 2501-2569 


A3 2601-2669 


A23 2701-2769 


VKIIl 


A27 2801-2869 


A11 2901-2969 


8 
«n 

o 
o 
rn 


LI 6 3101-3169 



Lf) O 
Table 10 Lambda FR1 GLG sequences



! VL1
CAG TCT GTG CTG ACT CAG CCA CCC TCG GTG TCT GAA
GCC CCC AGG CAG AGG GTC ACC ATC TCC TGT ! 1a
cag tct gtg ctg acG cag ccG ccc tcA gtg tct gGG
gcc ccA Ggg cag agg gtc acc atc tcc tgC ! 1e
cag tct gtg ctg act cag cca ccc tcA gCg tct gGG
Acc ccc Ggg cag agg gtc acc atc tcT tgt ! 1c
cag tct gtg ctg act cag cca ccc tcA gCg tct gGG
Acc ccc Ggg cag agg gtc acc atc tcT tgt ! 1g
cag tct gtg Ttg acG cag ccG ccc tcA gtg tct gCG
gcc ccA GgA cag aAg gtc acc atc tcc tgC ! 1b




! VL2
CAG TCT GCC CTG ACT CAG CCT CCC TCC GCG TCC GGG
TCT CCT GGA CAG TCA GTC ACC ATC TCC TGC
cag tct gcc ctg act cag cct cGc tcA gTg tcc ggg
tct cct gga cag tca gtc acc atc tcc tgc !
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cag 


tct 


gcc 


ctg 
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tcT ggg 
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tcG 
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tec tgc 
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agG 
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! 31 










tCC 
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CTG 


CCT 


GTG 


CTG 
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CAG 


CCC 


CCG 


TCT 


GCA 


TCT GCC 






TTG 


CTG 


GGA 


GCC 


TCG 


ATC 


AAG 


CTC 


ACC 


TGC 


! 4c 






cAg 


cct 


gtg 


ctg 


act 


caA 


TcA 


TcC 


tct 


gcc 


tct gcT 
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ctg 


gga 


Tec 


teg 


Gtc 


aag 
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cAg 


cTt 


gtg 


ctg 
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TcG 
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tct 


gcC 


tct gcc 


15 
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gga 
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teg 
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aag 
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tgc 


! 4b 






CAG CCT GTG CTG ACT CAG CCA CCT TCC TCC TCC GCA
TCT CCT GGA GAA TCC GCC AGA CTC ACC TGC ! 5e
cag Gct gtg ctg act cag ccG Gct tcc CTc tcT gca
tct cct gga gCa tcA gcc agT ctc acc tgc ! 5c
cag cct gtg ctg act cag cca Tct tcc CAT tcT gca
tct Tct gga gCa tcA gTc aga ctc acc tgc ! 5b
! VL6
AAT TTT ATG CTG ACT CAG CCC CAC TCT GTG TCG GAG
TCT CCG GGG AAG ACG GTA ACC ATC TCC TGC ! 6a
! VL7
CAG ACT GTG GTG ACT CAG GAG CCC TCA CTG ACT GTG
TCC CCA GGA GGG ACA GTC ACT CTC ACC TGT ! 7a
cag Gct gtg gtg act cag gag CCC tca Ctg act gtg
tcc cca gga ggg aca gtc act ctc acc tgt ! 7b
! VL8
CAG ACT GTG GTG ACC CAG GAG CCA TCG TTC TCA GTG
TCC CCT GGA GGG ACA GTC ACA CTC ACT TGT ! 8a
! VL9
CAG CCT GTG CTG ACT CAG CCA CCT TCT GCA TCA GCC
TCC CTG GGA GCC TCG GTC ACA CTC ACC TGC ! 9a
! VL10
CAG GCA GGG CTG ACT CAG CCA CCC TCG GTG TCC AAG
GGC TTG AGA CAG ACC GCC ACA CTC ACC TGC ! 10a

Table 11 RERSs found in human lambda FR1 GLGs
! There are 31 lambda GLGs
MlyI NnnnnnGACTC 25

1: 6 3: 6 4: 6 6: 6 7: 6 8: 6
9: 6 10: 6 11: 6 12: 6 15: 6 16: 6
20: 6 21: 6 22: 6 23: 6 23: 50 24: 6
25: 6 25: 50 26: 6 27: 6 28: 6 30: 6
31: 6

There are 23 hits at base# 6

-"- GAGTCNNNNNn

26: 34



MwoI GCNNNNNnngc 20 
1: 9 2: 9 3: 9 4: 9 11: 9 11: 
12: 9 13: 9 14: 9 16: 9 17: 9 18: 
19: 9 20: 9 23: 9 24: 9 25: 9 26: 
30: 9 31: 9 
There are 19 hits at base# 9 
HinfI Gantc 27 
1: 12 3: 12 4: 12 6: 12 7: 12 8: 
9: 12 10: 12 11: 12 12: 12 15: 12 16: 
20: 12 21: 12 22: 12 23: 12 23: 46 23: 
24: 12 25: 12 25: 56 26: 12 26: 34 27: 
28: 12 30: 12 31: 12 
There are 23 hits at base# 12 



PleI gactc 25 
1: 12 3: 12 4: 12 6: 12 7: 12 8: 12 
9: 12 10: 12 11: 12 12: 12 15: 12 16: 12 
20: 12 21: 12 22: 12 23: 12 23: 56 24: 12 
25: 12 25: 56 26: 12 27: 12 28: 12 30: 12 
31: 12 
There are 23 hits at base# 12 

-"- gagtc 1 
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26: 34 



DdeI Ctnag 32 
1: 14 2: 24 3: 14 3: 24 4: 14 4: 24 
5: 24 6: 14 7: 14 7: 24 8: 14 9: 14 
10: 14 11: 14 11: 24 12: 14 12: 24 15: 5 
15: 14 16: 14 16: 24 19: 24 20: 14 23: 14 
24: 14 25: 14 26: 14 27: 14 28: 14 29: 30 
30: 14 31: 14 
There are 21 hits at base# 14 
BsaJI Ccnngg 38 
1: 23 1: 40 2: 39 2: 40 3: 39 3: 40 
4: 39 4: 40 5: 39 11: 39 12: 38 12: 39 
13: 23 13: 39 14: 23 14: 39 15: 38 16: 39 
17: 23 17: 39 18: 23 18: 39 21: 38 21: 39 
21: 47 22: 38 22: 39 22: 47 26: 40 27: 39 
28: 39 29: 14 29: 39 30: 38 30: 39 30: 47 
31: 23 31: 32 
There are 17 hits at base# 39 
There are 5 hits at base# 38 
There are 5 hits at base# 40 Makes cleavage ragged. 

MnlI cctc 35 
1: 23 2: 23 3: 23 4: 23 5: 23 6: 19 
6: 23 7: 19 8: 23 9: 19 9: 23 10: 23 
11: 23 13: 23 14: 23 16: 23 17: 23 18: 23 
19: 23 20: 47 21: 23 21: 29 21: 47 22: 23 
22: 29 22: 35 22: 47 23: 26 23: 29 24: 27 
27: 23 28: 23 30: 35 30: 47 31: 23 
There are 21 hits at base# 23 
There are 3 hits at base# 19 
There are 3 hits at base# 29 
There are 1 hits at base# 26 
There are 1 hits at base# 27 These could make cleavage ragged. 

35 gagg 7 
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1: 48 


2: 


4 A 


"3 . 
O • 


A P 


A * 


A R 
ft 0 




4 A 




A A 




29: 44 
























BssKI Nccngg 39 
1: 40 2: 39 3: 39 3: 40 4: 39 4: 40 
5: 39 6: 31 6: 39 7: 31 7: 39 8: 39 
9: 31 9: 39 10: 39 11: 39 12: 38 12: 52 
13: 39 13: 52 14: 52 16: 39 16: 52 17: 39 
17: 52 18: 39 18: 52 19: 39 19: 52 21: 38 
22: 38 23: 39 24: 39 26: 39 27: 39 28: 39 
29: 14 29: 39 30: 38 
There are 21 hits at base# 39 
There are 4 hits at base# 38 
There are 3 hits at base# 31 
There are 3 hits at base# 40 Ragged 



BstNI CCwgg 30 
1: 41 2: 40 5: 40 6: 40 7: 40 8: 40 
9: 40 10: 40 11: 40 12: 39 12: 53 13: 40 
13: 53 14: 53 16: 40 16: 53 17: 40 17: 53 
18: 40 18: 53 19: 53 21: 39 22: 39 23: 40 
24: 40 27: 40 28: 40 29: 15 29: 40 30: 39 
There are 17 hits at base# 40 
There are 7 hits at base# 53 
There are 4 hits at base# 39 
There are 1 hits at base# 41 Ragged 



PspGI ccwgg 30 
1: 41 2: 40 5: 40 6: 40 7: 40 8: 40 
9: 40 10: 40 11: 40 12: 39 12: 53 13: 40 
13: 53 14: 53 16: 40 16: 53 17: 40 17: 53 
18: 40 18: 53 19: 53 21: 39 22: 39 23: 40 
24: 40 27: 40 28: 40 29: 15 29: 40 30: 39 
There are 17 hits at base# 40 
There are 7 hits at base# 53 
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There are 4 hits at base# 39 
There are 1 hits at base# 41 



ScrFI CCngg 39 
1: 41 2: 40 3: 40 3: 41 4: 40 4: 41 
5: 40 6: 32 6: 40 7: 32 7: 40 8: 40 
9: 32 9: 40 10: 40 11: 40 12: 39 12: 53 
13: 40 13: 53 14: 53 16: 40 16: 53 17: 40 
17: 53 18: 40 18: 53 19: 40 19: 53 21: 39 
22: 39 23: 40 24: 40 26: 40 27: 40 28: 40 
29: 15 29: 40 30: 39 
There are 21 hits at base# 40 
There are 4 hits at base# 39 
There are 3 hits at base# 41 

MaeIII gtnac 16 
1: 52 2: 52 3: 52 4: 52 5: 52 6: 52 
7: 52 9: 52 26: 52 27: 10 27: 52 28: 10 
28: 52 29: 10 29: 52 30: 52 
There are 13 hits at base# 52 

Tsp45I gtsac 15 
1: 52 2: 52 3: 52 4: 52 5: 52 6: 52 
7: 52 9: 52 27: 10 27: 52 28: 10 28: 52 
29: 10 29: 52 30: 52 
There are 12 hits at base# 52 

HphI tcacc 26 
1: 53 2: 53 3: 53 4: 53 5: 53 6: 53 
7: 53 8: 53 9: 53 10: 53 11: 59 13: 59 
14: 59 17: 59 18: 59 19: 59 20: 59 21: 59 
22: 59 23: 59 24: 59 25: 59 27: 59 28: 59 
30: 59 31: 59 
There are 16 hits at base# 59 
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There are 10 hits at base# 53 

BspMI ACCTGCNNNNn 14 
11: 61 13: 61 14: 61 17: 61 18: 61 19: 61 
20: 61 21: 61 22: 61 23: 61 24: 61 25: 61 
30: 61 31: 61 

There are 14 hits at base# 61 Goes into CDR1 
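
The hit lists in Table 11 can be reproduced in outline by scanning each FR1 sequence for a recognition pattern written with IUPAC degenerate bases. The following is only a minimal sketch, not the program used to prepare the table: the function names are hypothetical, positions are reported as the 1-based index of the first base of the match, and the example sequence is the 9a germline sequence listed above under VL9.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}


def site_regex(recognition):
    """Compile a recognition sequence such as 'GANTC' into a regex."""
    return re.compile("".join(IUPAC[base] for base in recognition.upper()))


def find_sites(sequence, recognition):
    """Return the 1-based start of every match, including overlapping ones."""
    pattern = site_regex(recognition)
    seq = sequence.upper().replace(" ", "")
    hits = []
    for start in range(len(seq) - len(recognition) + 1):
        # Pattern.match(string, pos) only succeeds for a match beginning at pos.
        if pattern.match(seq, start):
            hits.append(start + 1)
    return hits


# FR1 of lambda germline 9a, copied from the VL9 listing.
VL9_9A = (
    "CAG CCT GTG CTG ACT CAG CCA CCT TCT GCA TCA GCC "
    "TCC CTG GGA GCC TCG GTC ACA CTC ACC TGC"
)

if __name__ == "__main__":
    for name, site in [("HinfI", "GANTC"), ("DdeI", "CTNAG"), ("BspMI", "ACCTGC")]:
        print(name, find_sites(VL9_9A, site))
```

For this one sequence the sketch reports HinfI at base 12, DdeI at base 14 and BspMI at base 61, which are the predominant positions given for those enzymes in Table 11.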
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Table 12: 


Matches to URE FR3 adapters in 79 human HC. 

A. List of heavy-chain genes sampled 

AF008566  AF103367   HSA235674  HSU94417  S83240 
AF035043  AF103368   HSA235673  HSU94418  SABVH369 
AF103026  AF103369   HSA240559  HSU96389  SADEIGVH 
AF103033  AF103370   HSCB201    HSU96391  SAH2IGVH 
AF103061  AF103371   HSIGGVHC   HSU96392  SDA3IGVH 
AF103072  AF103372   HSU44791   HSU96395  SIGVHTTD 
AF103078  AF158381   HSU44793   HSZ93849  SUK4IGVH 
AF103099  E05213     HSU82771   HSZ93850 
AF103102  E05886     HSU82949   HSZ93851 
AF103103  E05887     HSU82950   HSZ93853 
AF103174  HSA235661  HSU82952   HSZ93855 
AF103186  HSA235664  HSU82961   HSZ93857 
AF103187  HSA235660  HSU86522   HSZ93860 
AF103195  HSA235659  HSU86523   HSZ93863 
AF103277  HSA235678  HSU92452   MCOMFRAA 
AF103286  HSA235677  HSU94412   MCOMFRVA 
AF103309  HSA235676  HSU94415   S82745 
AF103343  HSA235675  HSU94416   S82764 


Table 12B. Testing all distinct GLGs from bases 89.1 to 93.2 of the heavy variable domain 

Id  Nb   0   1   2   3   4                        SEQ ID NO: 
 1  38  15  11  10   0   2  Seq1 gtgtattactgtgc   25 
 2  19   7   6   4   2   0  Seq2 gtAtattactgtgc   26 
 3   1   0   0   1   0   0  Seq3 gtgtattactgtAA   27 
 4   7   1   5   1   0   0  Seq4 gtgtattactgtAc   28 
 5   0   0   0   0   0   0  Seq5 Ttgtattactgtgc   29 
 6   0   0   0   0   0   0  Seq6 TtgtatCactgtgc   30 
 7   3   1   0   1   1   0  Seq7 ACAtattactgtgc   31 
 8   2   0   2   0   0   0  Seq8 ACgtattactgtgc   32 
 9   9   2   2   4   1   0  Seq9 ATgtattactgtgc   33 
Group       26  26  21   4   2 
Cumulative  26  52  73  77  79 







Table 12C Most important URE recognition seqs in FR3 Heavy 

1 VHSzy1 GTGtattactgtgc (ON_SHC103) (SEQ ID NO:25) 
2 VHSzy2 GTAtattactgtgc (ON_SHC323) (SEQ ID NO:26) 
3 VHSzy4 GTGtattactgtac (ON_SHC349) (SEQ ID NO:28) 
4 VHSzy9 ATGtattactgtgc (ON_SHC5a) (SEQ ID NO:33) 



Table 12D, testing 79 human HC V genes with four probes 

Number of sequences 79 

Number of bases 29143 
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Number of mismatches 
Id  Best   0   1   2   3   4   5 
 1   39   15  11  10   1   2   0  Seq1 gtgtattactgtgc (SEQ ID NO:25) 
 2   22    7   6   5   3   0   1  Seq2 gtAtattactgtgc (SEQ ID NO:26) 
 3    7    1   5   1   0   0   0  Seq4 gtgtattactgtAc (SEQ ID NO:28) 
 4   11    2   4   4   1   0   0  Seq9 ATgtattactgtgc (SEQ ID NO:33) 
Group      25  26  20   5   2 
Cumulative 25  51  71  76  78 



One sequence has five mismatches with sequences 2, 4, and 9; 
it is scored as best for 2. 

Id is the number of the adapter. 
Best is the number of sequences for which the identified 
adapter was the best available. 
The rest of the table shows how well the sequences match the 
adapters. For example, there are 10 sequences that match 
VHSzy1 (Id = 1) with 2 mismatches and are worse for all other 
adapters. In this sample, 90% come within 2 bases of one of 
the four adapters. 
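
The scoring behind Table 12D can be illustrated with a short sketch. It assumes that each sampled gene is reduced to the 14-base window aligned with the adapters, that mismatches are counted position by position, and that ties go to the lowest adapter Id (consistent with the five-mismatch sequence being scored as best for 2). The function names are hypothetical and this is not the program that produced the table.

```python
# The four URE recognition sequences of Table 12C, keyed by adapter Id.
ADAPTERS = {
    1: "gtgtattactgtgc",  # VHSzy1, SEQ ID NO:25
    2: "gtatattactgtgc",  # VHSzy2, SEQ ID NO:26
    4: "gtgtattactgtac",  # VHSzy4, SEQ ID NO:28
    9: "atgtattactgtgc",  # VHSzy9, SEQ ID NO:33
}


def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a.lower(), b.lower()))


def best_adapter(target, adapters=ADAPTERS):
    """Return (mismatch_count, adapter_id) for the best-matching adapter.

    Tuples compare on the mismatch count first and the Id second, so a tie
    is resolved in favour of the lowest Id (an assumption of this sketch).
    """
    return min((mismatches(target, seq), ident) for ident, seq in adapters.items())


def fraction_within(group_counts, total, max_mismatches):
    """Fraction of sequences within max_mismatches of their best adapter."""
    return sum(group_counts[: max_mismatches + 1]) / total


if __name__ == "__main__":
    print(best_adapter("ttgtattactgtgc"))  # Seq5 of Table 12B
    # "Group" row of Table 12D: sequences with 0, 1, 2, 3, 4 mismatches.
    print(round(100 * fraction_within([25, 26, 20, 5, 2], 79, 2)))
```

With the Group row of Table 12D, 71 of the 79 sequences fall within two mismatches, which is the 90% quoted above.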
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Table 13 

The following list of enzymes was taken from 
http://rebase.neb.com/cgi-bin/asymmlist. 

I have removed the enzymes that a) cut within the recognition sequence, b) cut on 
both sides of the recognition sequence, or c) have fewer than 2 bases between the 
recognition sequence and the closest cut site. 

REBASE Enzymes 
04/13/2001 

Type II restriction enzymes with asymmetric recognition sequences: 



Enzymes    Recognition Sequence               Isoschizomers  Suppliers 
AarI       CACCTGCNNNN^NNNN_                                 y 
AceIII     CAGCTCNNNNNNN^NNNN_ 
Bbr7I      GAAGACNNNNNNN^NNNN_ 
BbvI       GCAGCNNNNNNNN^NNNN_                               y 
BbvII      GAAGACNN^NNNN_ 
Bce83I     CTTGAGNNNNNNNNNNNNNN_NN^ 
BceAI      ACGGCNNNNNNNNNNNN^NN_                             y 
BcefI      ACGGCNNNNNNNNNNNN^N_ 
BciVI      GTATCCNNNNN_N^                     BfuI           y 
BfiI       ACTGGGNNNN_N^                      BmrI           y 
BinI       GGATCNNNN^N_ 
BscAI      GCATCNNNN^NN_ 
BseRI      GAGGAGNNNNNNNN_NN^                                y 
BsmFI      GGGACNNNNNNNNNN^NNNN_              BspLU11III     y 
BspMI      ACCTGCNNNN^NNNN_                   Acc36I         y 
EciI       GGCGGANNNNNNNNN_NN^                               y 
Eco57I     CTGAAGNNNNNNNNNNNNNN_NN^           BspKT5I        y 
FauI       CCCGCNNNN^NN_                      BstFZ438I      y 
FokI       GGATGNNNNNNNNN^NNNN_               BstPZ418I      y 
GsuI       CTGGAGNNNNNNNNNNNNNN_NN^                          y 
HgaI       GACGCNNNNN^NNNNN_                                 y 
HphI       GGTGANNNNNNN_N^                    AsuHPI         y 
MboII      GAAGANNNNNNN_N^                                   y 
MlyI       GAGTCNNNNN^                        SchI           y 
MmeI       TCCRACNNNNNNNNNNNNNNNNNN_NN^ 
MnlI       CCTCNNNNNN_N^                                     y 
PleI       GAGTCNNNN^N_                       PpsI           y 
RleAI      CCCACANNNNNNNNN_NNN^ 
SfaNI      GCATCNNNNN^NNNN_                   BspST5I        y 
SspD5I     GGTGANNNNNNNN^ 
Sth132I    CCCGNNNN^NNNN_ 
StsI       GGATGNNNNNNNNNN^NNNN_ 
TaqII      GACCGANNNNNNNNN_NN^, CACCCANNNNNNNNN_NN^ 
Tth111II   CAARCANNNNNNNNN_NN^ 
UbaPI      CGAACG 

The notation is ^ means cut the upper strand and _ means cut the lower 
strand. If the upper and lower strand are cut at the same place, then only 
^ appears. 
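
The cut-site notation of Table 13 can be turned into numeric offsets with a few lines of code. The sketch below assumes the upper-strand cut mark is written as ^ and the lower-strand mark as _ (as in REBASE), and that the recognition sequence itself does not end in N; the function names are hypothetical. Offsets count the bases between the last base of the recognition sequence and the cut, read along the upper strand.

```python
def parse_cut_notation(entry):
    """Split e.g. 'GGATGNNNNNNNNN^NNNN_' into (recognition, top, bottom).

    top and bottom are the numbers of bases 3' of the recognition sequence
    (on the upper strand) that precede the upper- and lower-strand cuts.
    """
    if "^" not in entry:
        raise ValueError("no upper-strand cut mark in %r" % entry)
    bases = entry.replace("^", "").replace("_", "")
    recognition = bases.rstrip("N")  # assumes the site does not end in N
    top = entry.replace("_", "").index("^") - len(recognition)
    if "_" in entry:
        bottom = entry.replace("^", "").index("_") - len(recognition)
    else:
        bottom = top  # only ^ shown: both strands are cut at the same place
    return recognition, top, bottom


def top_strand_cuts(sequence, entry):
    """Return, for each site, how many bases lie 5' of the upper-strand cut.

    Only exact A/C/G/T recognition sequences are handled in this sketch;
    degenerate sites such as TCCRAC would need pattern matching.
    """
    recognition, top, _ = parse_cut_notation(entry)
    seq = sequence.upper()
    cuts = []
    start = seq.find(recognition)
    while start != -1:
        cut = start + len(recognition) + top
        if cut <= len(seq):
            cuts.append(cut)
        start = seq.find(recognition, start + 1)
    return cuts


if __name__ == "__main__":
    print(parse_cut_notation("GGATGNNNNNNNNN^NNNN_"))  # FokI
    print(parse_cut_notation("GAGGAGNNNNNNNN_NN^"))    # BseRI
    print(top_strand_cuts("cacGGATGtg" + "a" * 14, "GGATGNNNNNNNNN^NNNN_"))
```

For FokI this gives offsets of 9 and 13, so in a strand beginning cacGGATGtg the upper-strand cut falls after base 17.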
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Table 15: Use of Fokl as "Universal Restriction Enzyme" 

FokI - for dsDNA, | represents sites of cleavage 

                     sites of cleavage 
5'-cacGGATGtg-nnnnnnn|nnnnnnn-3' (SEQ ID NO:15) 
3'-gtgCCTACac-nnnnnnnnnnn|nnn-5' (SEQ ID NO:16) 
      RECOGNITion of FokI 

Case I 

5 1 - . . . gtg I tatt-actgtgc . . Substrate -3' (SEQ ID NO:17) 

10 3 1 -cac-ataa | tgacacq-, 

qtGTAGGcac\ 
5'- caCATCCgtg/(SEQ ID NO: 18) 

Case II 

5'-. . .gtgtatt lagac-tgc. .Substrate -3' (SEQ ID NO:19) 

1 5 j —cacataa -tctq I acg-5 * 

/gtgCCTACac 

\cacGGATGtg-3 1 (SEQ ID NO:20) 

Case III (Case I rotated 180 degrees) 

/gtgCCTACac-5' 
20 XcacGGATGtg— j 

qtqtctt I acaq-tcc-3 1 Adapter (SEQ ID NO: 21) 
3'-. . .cacagaa-tgtclagg. .substrate. .. .-5* (SEQ ID NO:22) 

Case IV (Case II rotated 180 degrees) 

3'- gtGTAGGcac\ (SEQ ID NO:23) 
25 r-caCATCCgtg/ 
5 1 -gag | tctc-actqaqc 
Substrate 3 1 - . . . ctc-agag I tgactcg ... -5 1 (SEQ ID NO: 24) 

Improved Fokl adapters 

Fokl - for dsDNA, | represents sites of cleavage 

30 Case I 

Stem ll f loop 5, stem 11, recognition 17 

5 1 - . . . catgtg | tatt-actgtgc . . Substrate .... -3 1 
3 ' -qtacac- ataa I tqacacq— } r T— , 

qtGTAGGcacG T 

35 5'- caCATCCgtgc C 

LttJ 
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Case II 

Stem 10, loop 5, stem 10, recognition 18 

5 1 - . . . gtgtatt | agac-tgctgcc . . Substrate . 

r-T-i — CSC 

T* gtgCCTACac 
C cacGGATGtg-3 1 

Case III (Case I rotated 180 degrees) 
Stem 11, loop 5, stem 11, recognition 20 

r T n 

T TgtgCCTACac-5 1 
G AcacGGATGtg-, 

LttJ crtcrtctt I acag-tccattctg-3 ' Adapter 

3 ' - . . . cacagaa-tgtc | aggtaagac , . substrate -5 1 

Case IV (Case II rotated 180 degrees) 
Stem 11, loop 4, stem 11, recognition 17 

3'- gtGTAGGcacc T 
j—caCATCCgtgg T 
5 ■ -atcgag I tctc-actoaac LtJ 
Substrate 3'-. . . tagctc-agag I tgactcg. . .-5' 



BseRI 

I sites of cleavage 
5 • -cacGAGGAGnnnnnnnnnn | nnnnn-3 1 
3 ■ -qtQctcctcnnnnnnnn j nnnnnnn-5 1 
RECOG 

NITion of BseRI 

Stem 11, loop 5, stem 11, recognition 19 

3 ' - gaacat I cg-ttaagccagta 5 • 

r T ~ T i cttgta-gc J aattcggtcat-3 ■ 

C GCTGAGGAGTC-J 

T cgactcctcag-5' An adapter for BseRI to cleave the substrate above. 
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What happens in the top strand: 

                         | site of cleavage in the upper strand 
(VL133-2a2*) 5'-g tct cct g|ga cag tcg atc 

(VL133-31*)  5'-g gcc ttg g|ga cag aca gtc 

(VL133-2c*)  5'-g tct cct g|ga cag tca gtc 

(VL133-1c*)  5'-g gcc cca g|gg cag agg gtc 

The following Extenders and Bridges all encode the AA sequence of 2a2 for codons 1-15 
                                     1 
(ON_LamEx133) 5'-ccTcTgAcTgAgT gcA cAg - 
   2   3   4   5   6   7   8   9  10  11  12 
  AGt gcT TtA acC caA ccG gcT AGT gtT AGC ggT - 
  13  14  15 
  tcC ccG g ! 2a2 

                                           1 
(ON_LamB1-133) [RC] 5'-ccTcTgAcTgAgT gcA cAg - 
   2   3   4   5   6   7   8   9  10  11  12 
  AGt gcT TtA acC caA ccG gcT AGT gtT AGC ggT - 
  13  14  15 
  tcC ccG g ga cag tcg at-3' ! 2a2 N.B. the actual seq is the 
                               reverse complement of the one shown. 

(ON_LamB2-133) [RC] 5'-ccTcTgAcTgAgT gcA cAg - 
   2   3   4   5   6   7   8   9  10  11  12 
  AGt gcT TtA acC caA ccG gcT AGT gtT AGC ggT - 
  13  14  15 
  tcC ccG g ga cag aca gt-3' ! 31 N.B. the actual seq is the 
                               reverse complement of the one shown. 

(ON_LamB3-133) [RC] 5'-ccTcTgAcTgAgT gcA cAg - 
   2   3   4   5   6   7   8   9  10  11  12 
  AGt gcT TtA acC caA ccG gcT AGT gtT AGC ggT - 
  13  14  15 
  tcC ccG g ga cag tca gt-3' ! 2c N.B. the actual seq is the 
                               reverse complement of the one shown. 
(ON_LamB4-133) [RC] 5'-ccTcTgAcTgAgT gcA cAg - 
   2   3   4   5   6   7   8   9  10  11  12 
  AGt gcT TtA acC caA ccG gcT AGT gtT AGC ggT - 
  13  14  15 
  tcC ccG g gg cag agg gt-3' ! 1c N.B. the actual seq is the 
                               reverse complement of the one shown. 



(ON_Lam133PCR) 5'-ccTcTgAcTgAgT gcA cAg AGt gc-3' 
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Table 19: 
Cleavage of 75 human light chains. 

Enzyme  Recognition*  Nch  Ns  Planned location of site 

Afel 


AGCgct 


0 


0 


Aflll 


Cttaag 


0 


0 


Ag6l 


ArTTrnl- 

3 j - 


q 


A 


AscI 


GGcgcgcc 


0 


0 


Bglll 


Agatct 


0 


0 


BsiWI 


Cgtacg 


0 


0 


BspDI 


ATcgat 


0 


0 


BssHII 


Gcgcgc 


0 


0 


BstBI 


TTcgaa 


0 


0 


Drain 


CACNNNgtg 


0 


0 


EagI 


Cggccg 


0 


0 


Fsel 


GGCCGGcc 


0 


0 


Fspl 


TGCgca 


0 


0 


Hpal 


GTTaac 


0 


0 


Mfel 


Caattg 


0 


0 


Mlul 


Acgcgt 


0 


0 


Ncol 


Ccatgg 


0 


0 


Khel 


Gctagc 


0 


0 


NotI 


GCggccgc 


0 


0 


Nrul 


TCGcga 


0 


0 


Pad 


TTAATtaa 


0 


0 


Pmel 


GTTTaaac 


0 


0 


Pmll 


CACgtg 


0 


0 


Pvul 


CGATcg 


0 


0 


SacII 


CCGCgg 


0 


0 


Sail 


Gtcgac 


0 


0 


Sfil 


GGCCNNNNnggcc 


0 


0 


Sgf I 


GCGATcgc 


0 


0 


SnaBI 


TACgta 


0 


0 


StuI 


AGGcct 


0 


0 


Xbal 


Tctaga 


0 


0 


Aatll 


GACGTc 


1 


1 


Acll 


AAcgtt 


1 


1 


Asel 


ATtaat 


1 


1 


BsmI 


GAATGCN 


1 


1 


BspEI 


Tccgga 


1 


1 


BstXI 


CCANNNNNntgg 


1 


1 


DrdI 


GACNNNNnngtc 


1 


1 


Hindlll 


Aagctt 


1 


1 


Pcil 


Acatgt 


1 


1 


Sapl 


gaagagc 


1 


1 


Seal 


AGTact 






SexAI 


Accwggt 






Spel 


Actagt 






Tlil 


Ctcgag 






Xhol 


Ctcgag 






Bcgl 


egannnnnntge 


2 


2 


BlpI 


GCtnagc 


2 


2 


BssSI 


Ctcgtg 


2 


2 


BstAPI 


GCANNNNntgc 


2 


2 


Espl 


GCtnagc 


2 


2 


KasI 


Ggcgcc 


2 


2 


PflMI 


CCANNNNntgg 


2 


2 


XmnI 


GAANNnnttc 


2 


2 


ApaXI 


Gtgcac 


3 


3 



HC FR3 



After LC 



Heavy chain signal 
HC/anchor linker 
In linker after HC 



Heavy Chain signal 



HC FR1 
HC FR2 



LC signal seq 
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Nael 


GCCqqc 


3 


3 


NgoMI 


Gccqqc 


3 


3 


PvuII 


CAGctg 


3 


3 


RsrII 


CGgwccg 


3 


3 


BsrBI 


GAG egg 






BsrDI 


GCAATGNNn 


4 


4 


BstZ17l 


GTAtac 


4 


4 


EcoRI 


Gaattc 


4 


4 


SphI 


GCATGc 


4 


4 


Sspl 


AATatt 


4 


4 


AccI 


GTmkac 


5 


5 


Bell 


Tgatca 


5 


5 


BsmBX 


Nnnnnngagacg 


5 


5 


BsrGI 


Tgtaca 


5 


5 


Oral 


TTTaaa 


6 


6 


Ndel 


CAtatg 


6 


6 


Swal 


ATTTaaat 


6 


6 


BamHI 


Ggatcc 


7 


7 


Sad 


GAGCTc 


7 


7 


BciVI 


GTATCCNNNNNN 


8 


8 


BsaBI 


GATNNnnatc 


8 


8 


Nsil 


ATGCAt 


8 


8 


Bspl20I 


Gggccc 


9 


9 


Apal 


GGGCCc 


9 


9 


PspOOMI 


Gqqccc 


9 


9 


BspHI 


Tcatga 


9 


11 


EcoRV 


GATatc 


9 


9 


Ahdl 


GACNNNnngtc 


11 


11 


Bbsl 


GAAGAC 


11 


14 


Psil 


TTAtaa 


12 


12 


Bsal 


GGTCTCNnnnn 


13 


15 


Xmal 


Cccqqq 


13 


14 


Aval 


Cycgrg 


14 


16 


Bgll 


GCCNNNNnggc 


14 


17 


AlwNI 


CAGNNNctg 


16 


16 


BspMI 


ACCTGC 


17 


19 


Xcml 


CCANNNNNnnnntgg 


17 


26 


BstEII 


Ggtnacc 


19 


22 


Sse8387I 


CCTGCAgg 


20 


20 


Avrll 


Cctagg 


22 


22 


Hindi 


GTYrac 


22 


22 


Bsgl 


GTGCAG 


27 


29 


MscI 


TGGcca 


30 


34 


BseRI 


NNnnnnnnnnctcctc 


32 


35 


Bsu36l 


CCtnagg 


35 


37 


PstI 


CTGCAg 


35 


40 


Ecil 


nnnnnnnnntccgcc 


38 


40 


PpuMI 


RGgwccy 


41 


50 


Styl 


Ccwwgg 


44 


73 


Eco0109I 


RGgnccy 


46 


70 


Acc65I 


Ggtacc 


50 


51 


Kpnl 


GGTACc 


50 


51 


Bpml 


ctccag 


53 


82 


Avail 


Ggwcc 


71 


124 



* Cleavage occurs in the top strand after the last upper-case base. For REs 
that cut palindromic sequences, the lower strand is cut at the symmetrical 
site. 
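
The upper/lower-case convention of the Recognition column can be decoded mechanically. The sketch below is an illustration only, with hypothetical function names: it reads the top-strand cut position as the number of bases up to and including the last upper-case base, and, for palindromic sites, derives the overhang from the symmetrical cut in the lower strand. Entries written entirely in lower case carry no cut information in this notation and are reported as 0.

```python
def top_strand_cut(recognition):
    """Number of bases 5' of the top-strand cut (0 if no upper-case base)."""
    upper_positions = [i for i, ch in enumerate(recognition) if ch.isupper()]
    return upper_positions[-1] + 1 if upper_positions else 0


def overhang(recognition):
    """Signed overhang length for a palindromic site.

    Positive: 5' overhang; negative: 3' overhang; zero: blunt ends.
    The lower strand is assumed to be cut at the symmetrical position.
    """
    cut = top_strand_cut(recognition)
    return len(recognition) - 2 * cut


if __name__ == "__main__":
    for name, site in [("EcoRI", "Gaattc"), ("ApaI", "GGGCCc"),
                       ("AfeI", "AGCgct"), ("BstXI", "CCANNNNNntgg")]:
        print(name, top_strand_cut(site), overhang(site))
```

Under these assumptions EcoRI (Gaattc) gives a four-base 5' overhang, ApaI (GGGCCc) a four-base 3' overhang, and AfeI (AGCgct) blunt ends.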
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Table 20: Cleavage of 79 human heavy chains 



Enzvme 


Recoanition 


Nch 


Ns 


Afel 


AGCgct 


0 


0 


Af III 


Cttaag 


w 


V 


AscI 


GGcgcgcc 


0 


0 


BsiWI 


Cgtacg 


0 


0 


BspDI 


ATcgat 


0 


0 


BssHII 


Gcgcgc 


0 


0 


Fsel 


GGCCGGcc 


0 


0 


Hpal 


GTTaac 


0 


0 


Nhel 


Gctagc 


0 


0 


NotI 


GCggccgc 


0 


0 


Nrul 


TCGcga 


0 


0 


Nsil 


ATGCAt 


0 


0 


Pad 


TTAATtaa 


0 


0 


Pcil 


Acatgt 


0 


0 


Pmel 


GTTTaaac 


0 


0 


Pvul 


CGATcg 


0 


0 


RsrII 


CGgwccg 


0 


0 


Sapl 


gaagagc 


0 


0 


Sfil 


GGCCNNNNnggcc 


0 


0 


Sgfl 


GCGATcgc 


0 


0 


Swal 


ATTTaaat 


0 


0 


Acll 


AAcgtt 


1 


1 


Age I 


Accggt 


1 


1 


Asel 


ATtaat 


1 


1 


Avrll 


Cctagg 


1 


1 


BsmI 


GAATGCN 


1 


1 


BsrBI 


GAG egg 


1 


1 


BsrDI 


GCAATGNNn 


1 


1 


Oral 


TTTaaa 


1 


1 


Fspl 


TGCgca 


1 


1 


Hindlll 


Aagctt 


1 


1 


Mfel 


Caattg 


1 


1 


Nael 


GCCggc 


1 


1 


NgoMI 


Gccggc 


1 


1 


Spel 


Actagt 


1 


1 


Acc65I 


Ggtacc 


2 


2 


BstBI 


TTcgaa 


2 


2 


Kpnl 


GGTACc 


2 


2 


Mlul 


Acgcgt 


2 


2 


Ncol 


Ccatgg 


2 


2 


Ndel 


CAtatg 


2 


2 


Pmll 


CACgtg 


2 


2 


Xcml 


CCANNNNNnnnntgg 


2 


2 


Beg I 


egannnnnntge 


3 


3 


Bell 


Tgatca 


3 


3 


Bgll 


GCCNNNNnggc 


3 


3 


BsaBI 


GATNNnnatc 


3 


3 


BsrGI 


Tgtaca 


3 


3 


SnaBI 


TACgta 


3 


3 


Sse8387I 


CCTGCAgg 


3 


3 


ApaLl 


Gtgcac 


4 


4 


BspHI 


Tcatga 


4 


4 


BssSI 


Ctcgtg 


4 


4 


Psil 


TTAtaa 


4 


5 



Nch Ns Planned location of site 






After LC 



HC Linker 

In linker, HC/anchor 



HC signal seq 



HC FR1 



In HC signal seq 
HC FR4 



LC Signal/FRl 
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SphI 


GCATGc 


4 


4 




Ahdl 


GACNNNnngtc 


5 


5 




BspEX 


Tccgga 


5 


5 


HC FR1 


MscI 


TGGcca 


5 


5 




SacI 


GAGCTc 


5 


5 




Seal 


AGTact 


5 


5 




SexAI 


Accwggt 


5 


6 




SsdI 

ST 


AATatt 


5 


5 




Tlil 


Ctcgag 


5 


5 




Xhol 


Ctcgag 


5 


5 




Bbsl 


GAAGAC 


7 


8 




BstAPI 


GCANNNNntgc 


7 


8 




BstZ17I 


GTAtac 


7 


7 




EcoRV 


GATatc 


7 


7 




EcoRI 


Gaattc 


8 


8 




BlpI 


GCtnagc 


9 


9 




Bsu36I 


CCtnagg 


9 


9 




Dralll 


CACNNNgtg 


9 


9 




Espl 


GCtnagc 


9 


9 




StuI 


AGGcct 


9 


13 




Xbal 


Tctaga 


9 


9 


HC FR3 


Bspl20I 


Gggccc 


10 


11 


CHI 


Apal 


GGGCCc 


10 


11 


CHI 


PspOOMI 


Gggccc 


10 


11 




BciVI 


GTATCCNNNNNN 


11 


11 




Sail 


Gtcgac 


11 


12 




DrdI 


GACNNNNnngtc 


12 


12 




KasI 


Ggcgcc 


12 


12 




Xmal 


Cccggg 


12 


14 




Bglll 


Agatct 


14 


14 




Hindi 


GTYrac 


16 


18 




BamHI 


Ggatcc 


17 


17 




PflMI 


CCANNNNntgg 


17 


18 




BsmBI 


Nnnnnngagacg 


18 


21 




BstXI 


CCANNNNNntgg 


18 


19 


HC FR2 


XmnI 


GAANNnnttc 


18 


18 




SacII 


CCGCqq 


19 


19 




PstI 


CTGCAg 


20 


24 




PvuII 


CAGctg 


20 


22 




Aval 


Cycgrg 


21 


24 




EagI 


Cggccg 


21 


22 




Aatll 


GACGTc 


22 


22 




BspMI 


ACCTGC 


27 


33 




AccI 


GTmkac 


30 


43 




Styl 


Ccwwgg 


36 


49 




AlwNI 


CAGNNNctg 


38 


44 




Bsal 


GGTCTCNnnnn 


38 


44 




PpuMI 


RGgwccy 


43 


46 




Bsgl 


GTGCAG 


44 


54 




BseRI 


NNnnnnnnnnctcctc 


48 


60 




Ecil 


nnnnnnnnnt ccgcc 


52 


57 




BstEIl 


Ggtnacc 


54 


61 


HC Fr4, 47/79 have one 


Eco0109I 


RGgnccy 


54 


86 




Bpml 


ctccag 


60 


121 




Avail 


Ggwcc 


71 


140 
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Table 21: MALI A3, annotated 
! MALIA3 9532 bases 






1 


aat 


n ct 


act 


act 


att 


a rrf- 
— j 


ana 

— 3 — 


att 


<TAt 

~ - 


a nr. 

a 


acc 


ttt 


tea 


net 

.» — - 


C n C 


3 — — 


gene ii continued 
























49 


cca 


aat 


gaa 


aat 


ata 


get 


aaa 


cag 


gtt 


att 


gac 


cat 


tta 


cga 


aat 


gta 


97 


tct 


aat 


ggt 


caa 


act 


aaa 


tct 


act 


cgt 


teg 


cag 


aat 




gaa 


tea 


act 


14 5 


gtt 


aca 


tgg 


aat 


gaa 


act 


tec 


aga 


cac 


cgt 


act 


tta 


gtt 


gca 


fat- 
Lot 


f- 4- -a 

tea 


193 


aaa 


cat 


gtt 


gag 




cag 


cac 


cag 


att 


cag 


caa 


tta 


age 




aag 


cca 


241 


tec 


gca 


aaa 


atg 


ace 


tct 


tat 


caa 


aag 


gag 


caa 


tta 


aag 


gta 


etc 


tct 


289 


aat 


cct 


gac 


ctg 


ttg 


gag 


ttt 


get 


tec 


ggt 


ctg 


gtt 


cgc 


ttt 


gaa 


get 


337 


cga 


att 


aaa 


acg 


cga 


tat 


ttg 


aag 


tct 


ttc 


ggg 


ctt 


cct 






Cll 


385 


ttt 


gat 


gca 


ate 


cgc 


ttt 


get 


tct 


gac 


tat 


aat 


agt 


cag 


ggt 


aaa 


gac 


433 


ctg 


att 


ttt 


gat 


tta 


tgg 


tea 


ttc 


teg 


ttt 


tct 


gaa 


ctg 


ttt 


aaa 


gca 


481 


ttt 


gag 


ggg 


gat 


tea 


ATG 


aat 


att 


tat 


gac 


gat 


tec 


gca 


gta 


ttg 


gac 






RBS? . . . 






Start gene x, ii continues 










529 


get 


ate 


cag 


tct 


aaa 


cat 


ttt 


act 


att 


ace 


ccc 


tct 


ggc 


aaa 


act 


tct 


577 


ttt 


gca 


aaa 


gee 


tct 


cgc 


tat 


ttt 


ggt 


ttt 


tat 


cgt 


cgt 


ctg 


gta 


aac 


625 


gag 


ggt 


tat 


gat 


agt 


gtt 


get 


ctt 


act 


atg 


cct 


cgt 


aat 


tec 




tgg 


673 


cgt 


tat 


gta 


tct 


gca 


tta 


gtt 


gaa 


tgt 


ggt 


att 


cct 


aaa 


tct 


caa 


ctg 


721 


atg 


aat 


ctt 


tct 


ace 


tgt 


aat 


aat 


gtt 


gtt 


ccg 


tta 


gtt 


cgt 


ttt 


att 


769 


aac 


gta 


gat 


ttt 


tct 


tec 


caa 


cgt 


cct 


gac 


tgg 


tat 


aat 


gag 


cca 


gtt 


817 


ctt 


aaa 


_atc 


gca 


T2V2X 


































hna 


X & 


II 




















832 


ggtaattca ca 




























Ml 








E5 










Q10 










T15 




843 


ATG 


att 


aaa 


gtt 


gaa 


att 


aaa 


cca 


tct 


caa 


gee 


caa 


ttt 


act 


apt- 

act 


cgt 




Start gene V 




























S17 






S20 










P25 










E30 






891 


tct 


ggt 


gtt 


tct 


cgt 


cag 


ggc 


aag 


cct 


tat 


tea 


ctg 


aat 


gag 


cag 


ctt 








V35 










E40 










V45 








939 


tgt 


tac 


gtt 


gat 


ttg 


ggt 


aat 


gaa 


tat 


ccg 


gtt 


ctt 


gtc 


aag 


att 


act 






D50 










A55 










L60 










987 


ctt 


gat 


gaa 


ggt 


cag 


cca 


gee 


tat 


gcg 


cct 


ggt 


cTG 


TAC 


Acc 


gtt 


cat 


























BsrGI . 










L65 










V70 










S75 










R80 


1035 


ctg 


tec 


tct 


ttc 


aaa 


gtt 


ggt 


cag 


ttc 


ggt 


tec 


ctt 


atg 


att 


gac 


cgt 












P85 




K87 


end 


of V 














1083 


ctg 


cgc 


etc 


gtt 


ccg 


get 


aag 


TAA 


C 
















1108 


ATG 


gag 


cag 


gtc 


gcg 


gat 


ttc 


gac 


aca 


att 


tat 


cag 


gcg 


atg 








Start gene VII 


























1150 


ata 


caa 


ate 


tec 


gtt 


gta 


ctt 


tgt 


ttc 


gcg 


ctt 


ggt 


ata 


ate 
















VII and IX overlap. 




























S2 


V3 


L4 


V5 








S10 




1192 


get 


ggg 


ggt 


caa 


agA TGA 


gt gtt tta gtg tat tct ttc gee tct ttc gtt 














End 


VII 






























I start IX 






















L13 




W15 










G20 










T25 






E29 


1242 


tta 


ggt 


tgg tgc 


ctt 


cgt 


agt 


ggc 


att 


acg 


tat 


ttt 


ace cgt 


tta 


atg gaa 
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1293 act tec tc 

.... stop of IX, IX and VIII overlap by four bases 
1301 ATG aaa aag tct tta gtc ctc aaa gcc tct gta gcc gtt gct acc ctc 
Start signal sequence of viii. 

1349 gtt ccg atg ctg tct ttc gct gct gag ggt gac gat ccc gca aaa gcg 

mature VIII > 

1397 gcc ttt aac tcc ctg caa gcc tca gcg acc gaa tat atc ggt tat gcg 
1445 tgg gcg atg gtt gtt gtc att 
1466 gtc ggc gca act atc ggt atc aag ctg ttt aag 
1499 aaa ttc acc tcg aaa gca ! 1515 
-35 .. 

1517 agc tga taaaccgat acaattaaag gctccttttg 

-10 

1552 gagccttttt ttttGGAGAt ttt ! S.D. underlined 

< III signal sequence 

          M   K   K   L   L   F   A   I   P   L   V 
1575 caac GTG aaa aaa tta tta ttc gca att cct tta gtt ! 1611 





V 


P 


F 


Y 


S 


H 


S 


A 


Q 


















1612 


gtt 


cct 


ttc 


tat 


tct 


cac 


aGT 


gcA Cag 


tCT 






























ApaLI . 




















1642 




GTC 


GTG 


ACG 


CAG 


CCG 


CCC 


TCA 


GTG 


TCT 


GGG 


GCC 


CCA 


GGG 


CAG 










AGG 


GTC 


ACC 


ATC 


TCC 


TGC 


ACT 


GGG 


AGC 


AGC 


TCC 


AAC 


ATC 


GGG 


GCA 








BstEII. . . 




























1729 




GGT 


TAT 


GAT 


GTA 


CAC 


TGG 


TAC 


CAG 


CAG 


CTT 


CCA 


GGA 


ACA 


GCC 


CCC 


AAA 


1777 




CTC 


CTC 


ATC 


TAT 


GGT 


AAC 


AGC 


AAT 


CGG 


CCC 


TCA 


GGG 


GTC 


CCT 


GAC 


CGA 


1825 




TTC 


TCT 


GGC 


TCC 


AAG 


TCT 


GGC 


ACC 


TCA 


GCC 


TCC 


CTG 


GCC 


ATC 


ACT 




1870 




GGG 


CTC 


CAG 


GCT 


GAG 


GAT 


GAG 


GCT 


GAT 


TAT 














1900 




TAC 


TGC 


CAG 


TCC 


TAT 


GAC 


AGC 


AGC 


CTG 


AGT 














1930 




GGC 


CTT 


TAT 


GTC 


TTC 


GGA 


ACT 


GGG 


ACC 


AAG 


GTC 


ACC 


GTC 






























BstEII . . . 










1969 




CTA 


GGT 


CAG 


CCC 


AAG 


GCC 


AAC 


CCC 


ACT 


GTC 


ACT 












2002 




CTG 


TTC 


CCG 


CCC 


TCC 


TCT 


GAG 


GAG 


CTC 


CAA 


GCC 


AAC 


AAG 


GCC 


ACA 


CTA 


2050 




GTG 


TGT 


CTG 


ATC 


AGT 


GAC 


TTC 


TAC 


CCG 


GGA 


GCT 


GTG 


ACA 


GTG 


GCC 


TGG 


2098 




AAG 


GCA 


GAT 


AGC 


AGC 


CCC 


GTC 


AAG 


GCG 


GGA 


GTG 


GAG 


ACC 


ACC 


ACA 


CCC 


2146 




TCC 


AAA 


CAA 


AGC 


AAC 


AAC 


AAG 


TAC 


GCG 


GCC 


AGC 


AGC 


TAT 


CTG 


AGC 


CTG 


2194 




ACG 


CCT 


GAG 


CAG 


TGG 


AAG 


TCC 


CAC 


AGA 


AGC 


TAC 


AGC 


TGC 


CAG 


GTC 


ACG 


2242 




CAT 


GAA 


GGG 


AGC 


ACC 


GTG 


GAG 


AAG 


ACA 


GTG 


GCC 


CCT 


ACA 


GAA 


TGT 


TCA 


2290 




TAA 


TAA 


ACCG CCTCCACCGG 


GCGCGCCAAT TCTATTTCAA GGAGACAGTC ATA 



AscI , 



2343 



PelB signal > 

MKYLLPTAAAGLLLL 
ATG AAA TAC CTA TTG CCT ACG GCA GCC GCT GGA TTG TTA TTA CTC 



16 17 18 19 20 
A A Q P A 
2388 gcG GCC cag ccG G CC 

Sfil 

NgoMI . . . (1/2) 
Ncol. . 



21 22 
M A 
atg g ee 
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FRKDP47/V3-23)- 
23 24 25 26 
E V Q L 



27 
L 



28 
E 



I Mfel | 



29 30 
S G 

tctjjggt i 



!                              FR1 
!       31  32  33  34  35  36  37  38  39  40  41  42  43  44  45 
!        G   G   L   V   Q   P   G   G   S   L   R   L   S   C   A 
2433  |ggc|ggt|ctt|gtt|cag|cct|ggt|ggt|tct|tta|cgt|ctt|tct|tgc|gct| 

!     FR1 >| ...CDR1 | FR2 
!       46  47  48  49  50  51  52  53  54  55  56  57  58  59  60 
!        A   S   G   F   T   F   S   S   Y   A   M   S   W   V   R 
2478  |gct|TCC|GGA|ttc|act|ttc|tct|tCG|TAC|Gct|atg|tct|tgg|gtt|cgC| 
!         | BspEI |                | BsiWI |                    |BstXI. 

!     FR2 >| ...CDR2 
!       61  62  63  64  65  66  67  68  69  70  71  72  73  74  75 
!        Q   A   P   G   K   G   L   E   W   V   S   A   I   S   G 
2523  |CAa|gct|ccT|GGt|aaa|ggt|ttg|gag|tgg|gtt|tct|gct|atc|tct|ggt| 
!     .BstXI         | 

!     ...CDR2 |---FR3--- 
!       76  77  78  79  80  81  82  83  84  85  86  87  88  89  90 
!        S   G   G   S   T   Y   Y   A   D   S   V   K   G   R   F 
2568  |tct|ggt|ggc|agt|act|tac|tat|gct|gac|tcc|gtt|aaa|ggt|cgc|ttc| 

!                              FR3 
!       91  92  93  94  95  96  97  98  99 100 101 102 103 104 105 
!        T   I   S   R   D   N   S   K   N   T   L   Y   L   Q   M 
2613  |act|atc|TCT|AGA|gac|aac|tct|aag|aat|act|ctc|tac|ttg|cag|atg| 
!             | XbaI  | 

!     FR3 >| 
!      106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 
!        N   S   L   R   A   E   D   T   A   V   Y   Y   C   A   K 
2658  |aac|agC|TTA|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat|tgc|gct|aaa| 
!            |AflII  |              | PstI | 

!     ...CDR3 | FR4 
!      121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 
!        D   Y   E   G   T   G   Y   A   F   D   I   W   G   Q   G 
2703  |gac|tat|gaa|ggt|act|ggt|tat|gct|ttc|gaC|ATA|TGg|ggt|caa|ggt| 
!                                            | NdeI |(1/4) 

!     FR4 >| 
!      136 137 138 139 140 141 142 
!        T   M   V   T   V   S   S 
2748  |act|atG|GTC|ACC|gtc|tct|agt 
!            |BstEII | 

! From BstEII onwards, pV323 is same as pCES1, except as noted. 
! BstEII sites may occur in light chains; not likely to be unique in final 
! vector. 
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i 

! 143 144 145 146 147 148 149 150 151 152
! ASTKGPSVFP
2769 gcc tcc acc aaG GGC CCa tcg GTC TTC ccc
! Bsp120I. BbsI...(2/2)
! ApaI....
!
! 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167
! LAPSSKSTSGGTAAL
2799 ctg gca ccC TCC TCc aag agc acc tct ggg ggc aca gcg gcc ctg
! BseRI...(2/2)
!
! 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182
! GCLVKDYFPEPVTVS
2844 ggc tgc ctg GTC AAG GAC TAC TTC CCc gaA CCG GTg acg gtg tcg
! AgeI....
!
! 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197
! WNSGALTSGVHTFPA
2889 tgg aac tca GGC GCC ctg acc agc ggc gtc cac acc ttc ccg gct
! KasI...(1/4)
!
! 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212
! VLQSSGLYSLSSVVT
2934 gtc cta cag tCt agc GGa ctc tac tcc ctc agc agc gta gtg acc
! (Bsu36I...)(knocked out)
!
! 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227
! VPSSSLGTQTYICNV
2979 gtg ccC tCt tct agc tTG Ggc acc cag acc tac atc tgc aac gtg
! (BstXI.....) N.B. destruction of BstXI & BpmI sites.
!
! 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242
! NHKPSNTKVDKKVEP
3024 aat cac aag ccc agc aac acc aag gtg gac aag aaa gtt gag ccc
!
! 243 244 245
! KSCAAAHHHHHHSA
3069 aaa tct tgt GCG GCC GCt cat cac cac cat cat cac tct gct
! NotI......
!
! EQKLISEEDLNGAA
3111 gaa caa aaa ctc atc tca gaa gag gat ctg aat ggt gcc gca
!
! DINDDRM ASGA
3153 GAT ATC aac gat gat cgt atg gct AGC ggc gcc
! rEK cleavage site NheI... KasI...
! EcoRV..
!
! Domain 1
! AETVESCLA
3183 gct gaa act gtt gaa agt tgt tta gca
!
! KPHTENSF
3210 aaa ccc cat aca gaa aat tca ttt
!
! TNVWKDDKT
3234 aCT AAC GTC TGG AAA GAC GAC AAA Act
!
! LDRYANYEGCLWNATGV
3261 tta gat cgt tac gct aac tat gag ggt tgt ctg tgG AAT GCt aca ggc gtt
! BsmI
!
! VVCTGDETQCYGTWVPI
3312 gta gtt tgt act ggt GAC GAA ACT CAG TGT TAC GGT ACA TGG GTT cct att
!
! GLAIPEN
3363 ggg ctt gct atc cct gaa aat
!
! L1 linker
! EGGGSEGGGS
3384 gag ggt ggt ggc tct gag ggt ggc ggt tct
!
! EGGGSEGGGT
3414 gag ggt ggc ggt tct gag ggt ggc ggt act
!

! Domain 2
3444 aaa cct cct gag tac ggt gat aca cct att ccg ggc tat act tat atc aac
3495 cct ctc gac ggc act tat ccg cct ggt act gag caa aac ccc gct aat cct
3546 aat cct tct ctt GAG GAG tct cag cct ctt aat act ttc atg ttt cag aat
! BseRI
3597 aat agg ttc cga aat agg cag ggg gca tta act gtt tat acg ggc act
3645 gtt act caa ggc act gac ccc gtt aaa act tat tac cag tac act cct
3693 gta tca tca aaa gcc atg tat gac gct tac tgg aac ggt aaa ttc AGA
! AlwNI
3741 GAC TGc gct ttc cat tct ggc ttt aat gaa gat cca ttc gtt tgt gaa
! AlwNI
3789 tat caa ggc caa tcg tct gac ctg cct caa cct cct gtc aat gct
!
3834 ggc ggc ggc tct
! start L2
3846 ggt ggt ggt tct
3858 ggt ggc ggc tct
3870 gag ggt ggt ggc tct gag ggt ggc ggt tct
3900 gag ggt ggc ggc tct gag gga ggc ggt tcc
3930 ggt ggt ggc tct ggt ! end L2



! Domain 3
! S G D F D Y E K M A N A N K G A
3945 tcc ggt gat ttt gat tat gaa aag atg gca aac gct aat aag ggg gct
!
! M T E N A D E N A L Q S D A K G
3993 atg acc gaa aat gcc gat gaa aac gcg cta cag tct gac gct aaa ggc
!
! K L D S V A T D Y G A A I D G F
4041 aaa ctt gat tct gtc gct act gat tac ggt gct gct atc gat ggt ttc
!
! I G D V S G L A N G N G A T G D
4089 att ggt gac gtt tcc ggc ctt gct aat ggt aat ggt gct act ggt gat
!
! F A G S N S Q M A Q V G D G D N
4137 ttt gct ggc tct aat tcc caa atg gct caa gtc ggt gac ggt gat aat
!
! S P L M N N F R Q Y L P S L P Q
4185 tca cct tta atg aat aat ttc cgt caa tat tta cct tcc ctc cct caa
!
! S V E C R P F V F S A G K P Y E
4233 tcg gtt gaa tgt cgc cct ttt gtc ttt agc gct ggt aaa cca tat gaa
!
! F S I D C D K I N L F R
4281 ttt tct att gat tgt gac aaa ata aac tta ttc cgt
! End Domain 3
!
! G V F A F L L Y V A T F M Y V F14
4317 ggt gtc ttt gcg ttt ctt tta tat gtt gcc acc ttt atg tat gta ttt
! start transmembrane segment
!
! S T F A N I L
4365 tct acg ttt gct aac ata ctg
!
! R N K E S
4386 cgt aat aag gag tct TAA ! stop of iii
! Intracellular anchor.


























! M1 P2 V L L5 G I P L L10 L R F L G15
4404 tc ATG cca gtt ctt ttg ggt att ccg tta tta ttg cgt ttc ctc ggt
! Start VI
!
4451 ttc ctt ctg gta act ttg ttc ggc tat ctg ctt act ttt ctt aaa aag
4499 ggc ttc ggt aag ata gct att gct att tca ttg ttt ctt gct ctt att
4547 att ggg ctt aac tca att ctt gtg ggt tat ctc tct gat att agc gct
4595 caa tta ccc tct gac ttt gtt cag ggt gtt cag tta att ctc ccg tct
4643 aat gcg ctt ccc tgt ttt tat gtt att ctc tct gta aag gct gct att
4691 ttc att ttt gac gtt aaa caa aaa atc gtt tct tat ttg gat tgg gat
!
! M1 A2 V3 F5 L10 G13
4739 aaa TAA t ATG gct gtt tat ttt gta act ggc aaa tta ggc tct gga
! end VI Start gene I
























! 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
! K T L V S V G K I Q D K I V A
4785 aag acg ctc gtt agc gtt ggt aag att cag gat aaa att gta gct
!
! 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43
! G C K I A T N L D L R L Q N L
4830 ggg tgc aaa ata gca act aat ctt gat tta agg ctt caa aac ctc
!
! 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58
! P Q V G R F A K T P R V L R I
4875 ccg caa gtc ggg agg ttc gct aaa acg cct cgc gtt ctt aga ata
!
! 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73
! P D K P S I S D L L A I G R G
4920 ccg gat aag cct tct ata tct gat ttg ctt gct att ggg cgc ggt
!
! 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88
! N D S Y D E N K N G L L V L D
4965 aat gat tcc tac gat gaa aat aaa aac ggc ttg ctt gtt ctc gat
!
! 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103
! E C G T W F N T R S W N D K E
5010 gag tgc ggt act tgg ttt aat acc cgt tct tgg aat gat aag gaa
!
! 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118
! R Q P I I D W F L H A R K L G
5055 aga cag ccg att att gat tgg ttt cta cat gct cgt aaa tta gga
!
! 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133
! W D I I F L V Q D L S I V D K
5100 tgg gat att att ttt ctt gtt cag gac tta tct att gtt gat aaa
!
! 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148
! Q A R S A L A E H V V Y C R R
5145 cag gcg cgt tct gca tta gct gaa cat gtt gtt tat tgt cgt cgt
!
! 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163
! L D R I T L P F V G T L Y S L
5190 ctg gac aga att act tta cct ttt gtc ggt act tta tat tct ctt
!
! 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178
! I T G S K M P L P K L H V G V
5235 att act ggc tcg aaa atg cct ctg cct aaa tta cat gtt ggc gtt
!
! 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193
! V K Y G D S Q L S P T V E R W
5280 gtt aaa tat ggc gat tct caa tta agc cct act gtt gag cgt tgg
!
! 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208
! L Y T G K N L Y N A Y D T K Q
5325 ctt tat act ggt aag aat ttg tat aac gca tat gat act aaa cag
!
! 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223
! A F S S N Y D S G V Y S Y L T
5370 gct ttt tct agt aat tat gat tcc ggt gtt tat tct tat tta acg
!
! 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238
! P Y L S H G R Y F K P L N L G
5415 cct tat tta tca cac ggt cgg tat ttc aaa cca tta aat tta ggt
!
! 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253
! Q K M K L T K I Y L K K F S R
5460 cag aag atg aaa tta act aaa ata tat ttg aaa aag ttt tct cgc
!
! 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268
! V L C L A I G F A S A F T Y S
5505 gtt ctt tgt ctt gcg att gga ttt gca tca gca ttt aca tat agt
!
! 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283
! Y I T Q P K P E V K K V V S Q
5550 tat ata acc caa cct aag ccg gag gtt aaa aag gta gtc tct cag
!
! 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298
! T Y D F D K F T I D S S Q R L
5595 acc tat gat ttt gat aaa ttc act att gac tct tct cag cgt ctt
!
! 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313
! N L S Y R Y V F K D S K G K L
5640 aat cta agc tat cgc tat gtt ttc aag gat tct aag gga aaa TTA
! PacI
!
! 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328
! I N S D D L Q K Q G Y S L T Y
5685 ATT AAt agc gac gat tta cag aag caa ggt tat tca ctc aca tat
! PacI....
!
! 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343
! i I D L C T V S I K K G N S N E
! iv M1 K
5730 att gat tta tgt act gtt tcc att aaa aaa ggt aat tca aAT Gaa
! Start IV
!
! 344 345 346 347 348 349
! i I V K C N .End of I
! iv L3 L N5 V I7 N F V10
5775 att gtt aaa tgt aat TAA T TTT GTT

! IV continued
5800 ttc ttg atg ttt gtt tca tca tct tct ttt gct cag gta att gaa atg
5848 aat aat tcg cct ctg cgc gat ttt gta act tgg tat tca aag caa tca
5896 ggc gaa tcc gtt att gtt tct ccc gat gta aaa ggt act gtt act gta
5944 tat tca tct gac gtt aaa cct gaa aat cta cgc aat ttc ttt att tct
5992 gtt tta cgt gct aat aat ttt gat atg gtt ggt tca att cct tcc ata
6040 att cag aag tat aat cca aac aat cag gat tat att gat gaa ttg cca
6088 tca tct gat aat cag gaa tat gat gat aat tcc gct cct tct ggt ggt
6136 ttc ttt gtt ccg caa aat gat aat gtt act caa act ttt aaa att aat
6184 aac gtt cgg gca aag gat tta ata cga gtt gtc gaa ttg ttt gta aag
6232 tct aat act tct aaa tcc tca aat gta tta tct att gac ggc tct aat
6280 cta tta gtt gtt TCT gca cct aaa gat att tta gat aac ctt cct caa
! ApaLI removed
6328 ttc ctt tct act gtt gat ttg cca act gac cag ata ttg att gag ggt
6376 ttg ata ttt gag gtt cag caa ggt gat gct tta gat ttt tca ttt gct
6424 gct ggc tct cag cgt ggc act gtt gca ggc ggt gtt aat act gac cgc
6472 ctc acc tct gtt tta tct tct gct ggt ggt tcg ttc ggt att ttt aat
6520 ggc gat gtt tta ggg cta tca gtt cgc gca tta aag act aat agc cat
6568 tca aaa ata ttg tct gtg cca cgt att ctt acg ctt tca ggt cag aag
6616 ggt tct atc tct gtT GGC CAg aat gtc cct ttt att act ggt cgt gtg
! MscI
6664 act ggt gaa tct gcc aat gta aat aat cca ttt cag acg att gag cgt
6712 caa aat gta ggt att tcc atg agc gtt ttt cct gtt gca atg gct ggc
6760 ggt aat att gtt ctg gat att acc agc aag gcc gat agt ttg agt tct
6808 tct act cag gca agt gat gtt att act aat caa aga agt att gct aca
6856 acg gtt aat ttg cgt gat gga cag act ctt tta ctc ggt ggc ctc act
6904 gat tat aaa aac act tct caa gat tct ggc gta ccg ttc ctg tct aaa
6952 atc cct tta atc ggc ctc ctg ttt agc tcc cgc tct gat tcc aac gag
7000 gaa agc acg tta tac gtg ctc gtc aaa gca acc ata gta cgc gcc ctg
7048 TAG cggcgcatt
! End IV

7060 aagcgcggcg ggtgtggtgg ttacgcgcag cgtgaccgct acacttgcca gcgccctagc
7120 gcccgctcct ttcgctttct tcccttcctt tctcgccacg ttcGCCGGCt ttccccgtca
! NgoMI_
7180 agctctaaat cgggggctcc ctttagggtt ccgatttagt gctttacggc acctcgaccc
7240 caaaaaactt gatttgggtg atggttCACG TAGTGggcca tcgccctgat agacggtttt
! DraIII
7300 tcgccctttG ACGTTGGAGT Ccacgttctt taatagtgga ctcttgttcc aaactggaac
! DrdI
7360 aacactcaac cctatctcgg gctattcttt tgatttataa gggattttgc cgatttcgga
7420 accaccatca aacaggattt tcgcctgctg gggcaaacca gcgtggaccg cttgctgcaa
7480 ctctctcagg gccaggcggt gaagggcaat CAGCTGttgc cCGTCTCact ggtgaaaaga
! PvuII. BsmBI.
7540 aaaaccaccc tGGATCC AAGCTT
! BamHI HindIII (1/2)
! Insert carrying bla gene
7563 gcaggtg gcacttttcg gggaaatgtg cgcggaaccc
7600 ctatttgttt atttttctaa atacattcaa atatGTATCC gctcatgaga caataaccct
! BciVI
gataaatgct tcaataatat tgaaaaagga agagt
! RBS.?.
! Start bla gene
7695 ATG agt att caa cat ttc cgt gtc gcc ctt att ccc ttt ttt gcg gca ttt
7746 tgc ctt cct gtt ttt gct cac cca gaa acg ctg gtg aaa gta aaa gat gct
7797 gaa gat cag ttg ggC gCA CGA Gtg ggt tac atc gaa ctg gat ctc aac agc
! BssSI...
! ApaLI removed
7848 ggt aag atc ctt gag agt ttt cgc ccc gaa gaa cgt ttt cca atg atg agc
7899 act ttt aaa gtt ctg cta tgt cat aca cta tta tcc cgt att gac gcc ggg
7950 caa gaG CAA CTC GGT CGc cgg gcg cgg tat tct cag aat gac ttg gtt gAG
! BcgI ScaI
8001 TAC Tca cca gtc aca gaa aag cat ctt acg gat ggc atg aca gta aga gaa
! ScaI_
8052 tta tgc agt gct gcc ata acc atg agt gat aac act gcg gcc aac tta ctt
8103 ctg aca aCG ATC Gga gga ccg aag gag cta acc gct ttt ttg cac aac atg
! PvuI
8154 ggg gat cat gta act cgc ctt gat cgt tgg gaa ccg gag ctg aat gaa gcc
8205 ata cca aac gac gag cgt gac acc acg atg cct gta gca atg cca aca acg
8256 tTG CGC Aaa cta tta act ggc gaa cta ctt act cta gct tcc cgg caa caa
! FspI....
8307 tta ata gac tgg atg gag gcg gat aaa gtt gca gga cca ctt ctg cgc tcg
8358 GCC ctt ccG GCt ggc tgg ttt att gct gat aaa tct gga gcc ggt gag cgt
! BglI_
8409 gGG TCT Cgc ggt atc att gca gca ctg ggg cca gat ggt aag ccc tcc cgt
! BsaI
8460 atc gta gtt atc tac acG ACg ggg aGT Cag gca act atg gat gaa cga aat
! AhdI
8511 aga cag atc gct gag ata ggt gcc tca ctg att aag cat tgg TAA ctgt
! stop
8560 cagaccaagt ttactcatat atactttaga ttgatttaaa acttcatttt taatttaaaa
8620 ggatctaggt gaagatcctt tttgataatc tcatgaccaa aatcccttaa cgtgagtttt
8680 cgttccactg tacgtaagac cccc
8704 AAGCTT GTCGAC tgaa tggcgaatgg cgctttgcct
! HindIII SalI..
! (2/2) HincII
8740 ggtttccggc accagaagcg gtgccggaaa gctggctgga gtgcgatctt
8790 CCTGAGG
! Bsu36I_
8797 ccgat actgtcgtcg tcccctcaaa ctggcagatg
8832 cacggttacg atgcgcccat ctacaccaac gtaacctatc ccattacggt caatccgccg
8892 tttgttccca cggagaatcc gacgggttgt tactcgctca catttaatgt tgatgaaagc
8952 tggctacagg aaggccagac gcgaattatt tttgatggcg ttcctattgg ttaaaaaatg
9012 agctgattta acaaaaattt aacgcgaatt ttaacaaaat attaacgttt acaATTTAAA
! SwaI...
9072 Tatttgctta tacaatcttc ctgtttttgg ggcttttctg attatcaacc GGGGTAcat
! RBS?
9131 ATG att gac atg cta gtt tta cga tta ccg ttc atc gat tct ctt gtt tgc
! Start gene II
9182 tcc aga ctc tca ggc aat gac ctg ata gcc ttt gtA GAT CTc tca aaa ata
! BglII...
9233 gct acc ctc tcc ggc atg aat tta tca gct aga acg gtt gaa tat cat att
9284 gat ggt gat ttg act gtc tcc ggc ctt tct cac cct ttt gaa tct tta cct
9335 aca cat tac tca ggc att gca ttt aaa ata tat gag ggt tct aaa aat ttt
9386 tat cct tgc gtt gaa ata aag gct tct ccc gca aaa gta tta cag ggt cat
9437 aat gtt ttt ggt aca acc gat tta gct tta tgc tct gag gct tta ttg ctt
9488 aat ttt gct aat tct ttg cct tgc ctg tat gat tta ttg gat gtt ! 9532
! gene II continues
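
The one-letter amino-acid lines in the annotated table can be spot-checked against the codon rows beneath them. The following is a minimal sketch only, assuming the standard genetic code applies; the function name `translate` and the table construction are illustrative and are not part of the original disclosure. The two test rows are copied from the table above (the PelB signal row at 2343 and the end of gene iii at 4386).

```python
# Standard genetic code laid out in TCAG order (assumption: no alternative code is needed).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(codon_row: str) -> str:
    """Translate one codon row; spaces, '|' separators and letter case are ignored."""
    seq = "".join(ch for ch in codon_row.upper() if ch in "ACGT")
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon, if any
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Rows copied from the annotated table.
pelb_row = "ATG AAA TAC CTA TTG CCT ACG GCA GCC GCT GGA TTG TTA TTA CTC"
end_of_iii_row = "cgt aat aag gag tct TAA"

print(translate(pelb_row))        # compare with the line above position 2343
print(translate(end_of_iii_row))  # '*' marks the stop codon
```

A row whose translation disagrees with its annotation points either to a transcription error in the codons or to a mislabeled residue, which is a quick way to locate damaged lines in a long listing.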
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Table 21B: Sequence of MALIA3, condensed 



LOCUS MALIA3 9532 CIRCULAR
ORIGIN 

— - — -» ww»««w**» w j.* * A j.a iw a* »w JUli iwiiivjw *-*v-» a *. x x ^nvj w i ^ocov.cUVfV nnrt x orvrt/A/A 1 

61 ATAGCTAAAC AGGTTATTGA CCATTTGCGA AATGTATCTA ATGGTCAAAC TAAATCTACT

121 CGTTCGCAGA ATTGGGAATC AACTGTTACA TGGAATGAAA CTTCCAGACA CCGTACTTTA 

181 GTTGCATATT TAAAACATGT TGAGCTACAG CACCAGATTC AGCAATTAAG CTCTAAGCCA 

241 TCCGCAAAAA TGACCTCTTA TCAAAAGGAG CAATTAAAGG TACTCTCTAA TCCTGACCTG 

361 TCTTTCGGGC TTCCTCTTAA TCTTTTTGAT GCAATCCGCT TTGCTTCTGA CTATAATAGT 

421 CAGGGTAAAG ACCTGATTTT TGATTTATGG TCATTCTCGT TTTCTGAACT GTTTAAAGCA

481 TTTGAGGGGG ATTCAATGAA TATTTATGAC GATTCCGCAG TATTGGACGC TATCCAGTCT 

541 AAACATTTTA CTATTACCCC CTCTGGCAAA ACTTCTTTTG CAAAAGCCTC TCGCTATTTT 

601 GGTTTTTATC GTCGTCTGGT AAACGAGGGT TATGATAGTG TTGCTCTTAC TATGCCTCGT 

661 AATTCCTTTT GGCGTTATGT ATCTGCATTA GTTGAATGTG GTATTCCTAA ATCTCAACTG 

721 ATGAATCTTT CTACCTGTAA TAATGTTGTT CCGTTAGTTC GTTTTATTAA CGTAGATTTT

781 TCTTCCCAAC GTCCTGACTG GTATAATGAG CCAGTTCTTA AAATCGCATA AGGTAATTCA 

841 CAATGATTAA AGTTGAAATT AAACCATCTC AAGCCCAATT TACTACTCGT TCTGGTGTTT 

901 CTCGTCAGGG CAAGCCTTAT TCACTGAATG AGCAGCTTTG TTACGTTGAT TTGGGTAATG 

961 AATATCCGGT TCTTGTCAAG ATTACTCTTG ATGAAGGTCA GCCAGCCTAT GCGCCTGGTC 

1021 TGTACACCGT TCATCTGTCC TCTTTCAAAG TTGGTCAGTT CGGTTCCCTT ATGATTGACC

1081 GTCTGCGCCT CGTTCCGGCT AAGTAACATG GAGCAGGTCG CGGATTTCGA CACAATTTAT 

1141 CAGGCGATGA TACAAATCTC CGTTGTACTT TGTTTCGCGC TTGGTATAAT CGCTGGGGGT 

1201 CAAAGATGAG TGTTTTAGTG TATTCTTTCG CCTCTTTCGT TTTAGGTTGG TGCCTTCGTA 

1261 GTGGCATTAC GTATTTTACC CGTTTAATGG AAACTTCCTC ATGAAAAAGT CTTTAGTCCT 

1321 CAAAGCCTCT GTAGCCGTTG CTACCCTCGT TCCGATGCTG TCTTTCGCTG CTGAGGGTGA

1381 CGATCCCGCA AAAGCGGCCT TTAACTCCCT GCAAGCCTCA GCGACCGAAT ATATCGGTTA 

1441 TGCGTGGGCG ATGGTTGTTG TCATTGTCGG CGCAACTATC GGTATCAAGC TGTTTAAGAA 

1501 ATTCACCTCG AAAGCAAGCT GATAAACCGA TACAATTAAA GGCTCCTTTT GGAGCCTTTT 

1561 TTTTTGGAGA TTTTCAACGT GAAAAAATTA TTATTCGCAA TTCCTTTAGT TGTTCCTTTC 

1621 TATTCTCACA GTGCACAGTC TGTCGTGACG CAGCCGCCCT CAGTGTCTGG GGCCCCAGGG

1681 CAGAGGGTCA CCATCTCCTG CACTGGGAGC AGCTCCAACA TCGGGGCAGG TTATGATGTA 

1741 CACTGGTACC AGCAGCTTCC AGGAACAGCC CCCAAACTCC TCATCTATGG TAACAGCAAT

1801 CGGCCCTCAG GGGTCCCTGA CCGATTCTCT GGCTCCAAGT CTGGCACCTC AGCCTCCCTG 

1861 GCCATCACTG GGCTCCAGGC TGAGGATGAG GCTGATTATT ACTGCCAGTC CTATGACAGC 

1921 AGCCTGAGTG GCCTTTATGT CTTCGGAACT GGGACCAAGG TCACCGTCCT AGGTCAGCCC

1981 AAGGCCAACC CCACTGTCAC TCTGTTCCCG CCCTCCTCTG AGGAGCTCCA AGCCAACAAG 

2041 GCCACACTAG TGTGTCTGAT CAGTGACTTC TACCCGGGAG CTGTGACAGT GGCCTGGAAG 

2101 GCAGATAGCA GCCCCGTCAA GGCGGGAGTG GAGACCACCA CACCCTCCAA ACAAAGCAAC 

2161 AACAAGTACG CGGCCAGCAG CTATCTGAGC CTGACGCCTG AGCAGTGGAA GTCCCACAGA 

2221 AGCTACAGCT GCCAGGTCAC GCATGAAGGG AGCACCGTGG AGAAGACAGT GGCCCCTACA

2281 GAATGTTCAT AATAAACCGC CTCCACCGGG CGCGCCAATT CTATTTCAAG GAGACAGTCA 

2341 TAATGAAATA CCTATTGCCT ACGGCAGCCG CTGGATTGTT ATTACTCGCG GCCCAGCCGG 

2401 CCATGGCCGA AGTTCAATTG TTAGAGTCTG GTGGCGGTCT TGTTCAGCCT GGTGGTTCTT 

2461 TACGTCTTTC TTGCGCTGCT TCCGGATTCA CTTTCTCTTC GTACGCTATG TCTTGGGTTC

2521 GCCAAGCTCC TGGTAAAGGT TTGGAGTGGG TTTCTGCTAT CTCTGGTTCT GGTGGCAGTA

2581 CTTACTATGC TGACTCCGTT AAAGGTCGCT TCACTATCTC TAGAGACAAC TCTAAGAATA 

2641 CTCTCTACTT GCAGATGAAC AGCTTAAGGG CTGAGGACAC TGCAGTCTAC TATTGCGCTA 

2701 AAGACTATGA AGGTACTGGT TATGCTTTCG ACATATGGGG TCAAGGTACT ATGGTCACCG 

2761 TCTCTAGTGC CTCCACCAAG GGCCCATCGG TCTTCCCCCT GGCACCCTCC TCCAAGAGCA 

2821 CCTCTGGGGG CACAGCGGCC CTGGGCTGCC TGGTCAAGGA CTACTTCCCC GAACCGGTGA

2881 CGGTGTCGTG GAACTCAGGC GCCCTGACCA GCGGCGTCCA CACCTTCCCG GCTGTCCTAC 

2941 AGTCTAGCGG ACTCTACTCC CTCAGCAGCG TAGTGACCGT GCCCTCTTCT AGCTTGGGCA 

3001 CCCAGACCTA CATCTGCAAC GTGAATCACA AGCCCAGCAA CACCAAGGTG GACAAGAAAG 

3061 TTGAGCCCAA ATCTTGTGCG GCCGCTCATC ACCACCATCA TCACTCTGCT GAACAAAAAC 

3121 TCATCTCAGA AGAGGATCTG AATGGTGCCG CAGATATCAA CGATGATCGT ATGGCTGGCG

3181 CCGCTGAAAC TGTTGAAAGT TGTTTAGCAA AACCCCATAC AGAAAATTCA TTTACTAACG 

3241 TCTGGAAAGA CGACAAAACT TTAGATCGTT ACGCTAACTA TGAGGGTTGT CTGTGGAATG 
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3301 CTACAGGCGT TGTAGTTTGT ACTGGTGACG AAACTCAGTG TTACGGTACA TGGGTTCCTA 

3361 TTGGGCTTGC TATCCCTGAA AATGAGGGTG GTGGCTCTGA GGGTGGCGGT TCTGAGGGTG 

3421 GCGGTTCTGA GGGTGGCGGT ACTAAACCTC CTGAGTACGG TGATACACCT ATTCCGGGCT 

3481 ATACTTATAT CAACCCTCTC GACGGCACTT ATCCGCCTGG TACTGAGCAA AACCCCGCTA 

3541 ATCCTAATCC TTCTCTTGAG GAGTCTCAGC CTCTTAATAC TTTCATGTTT CAGAATAATA

3601 GGTTCCGAAA TAGGCAGGGG GCATTAACTG TTTATACGGG CACTGTTACT CAAGGCACTG 

3661 ACCCCGTTAA AACTTATTAC CAGTACACTC CTGTATCATC AAAAGCCATG TATGACGCTT 

3721 ACTGGAACGG TAAATTCAGA GACTGCGCTT TCCATTCTGG CTTTAATGAA GATCCATTCG 

3781 TTTGTGAATA TCAAGGCCAA TCGTCTGACC TGCCTCAACC TCCTGTCAAT GCTGGCGGCG 

3841 GCTCTGGTGG TGGTTCTGGT GGCGGCTCTG AGGGTGGTGG CTCTGAGGGT GGCGGTTCTG

3901 AGGGTGGCGG CTCTGAGGGA GGCGGTTCCG GTGGTGGCTC TGGTTCCGGT GATTTTGATT 

3961 ATGAAAAGAT GGCAAACGCT AATAAGGGGG CTATGACCGA AAATGCCGAT GAAAACGCGC 

4021 TACAGTCTGA CGCTAAAGGC AAACTTGATT CTGTCGCTAC TGATTACGGT GCTGCTATCG 

4081 ATGGTTTCAT TGGTGACGTT TCCGGCCTTG CTAATGGTAA TGGTGCTACT GGTGATTTTG 

4141 CTGGCTCTAA TTCCCAAATG GCTCAAGTCG GTGACGGTGA TAATTCACCT TTAATGAATA

4201 ATTTCCGTCA ATATTTACCT TCCCTCCCTC AATCGGTTGA ATGTCGCCCT TTTGTCTTTA 

4261 GCGCTGGTAA ACCATATGAA TTTTCTATTG ATTGTGACAA AATAAACTTA TTCCGTGGTG 

4321 TCTTTGCGTT TCTTTTATAT GTTGCCACCT TTATGTATGT ATTTTCTACG TTTGCTAACA 

4381 TACTGCGTAA TAAGGAGTCT TAATCATGCC AGTTCTTTTG GGTATTCCGT TATTATTGCG 

4441 TTTCCTCGGT TTCCTTCTGG TAACTTTGTT CGGCTATCTG CTTACTTTTC TTAAAAAGGG

4501 CTTCGGTAAG ATAGCTATTG CTATTTCATT GTTTCTTGCT CTTATTATTG GGCTTAACTC 

4561 AATTCTTGTG GGTTATCTCT CTGATATTAG CGCTCAATTA CCCTCTGACT TTGTTCAGGG

4621 TGTTCAGTTA ATTCTCCCGT CTAATGCGCT TCCCTGTTTT TATGTTATTC TCTCTGTAAA 

4681 GGCTGCTATT TTCATTTTTG ACGTTAAACA AAAAATCGTT TCTTATTTGG ATTGGGATAA

4741 ATAATATGGC TGTTTATTTT GTAACTGGCA AATTAGGCTC TGGAAAGACG CTCGTTAGCG

4801 TTGGTAAGAT TCAGGATAAA ATTGTAGCTG GGTGCAAAAT AGCAACTAAT CTTGATTTAA

4861 GGCTTCAAAA CCTCCCGCAA GTCGGGAGGT TCGCTAAAAC GCCTCGCGTT CTTAGAATAC 

4921 CGGATAAGCC TTCTATATCT GATTTGCTTG CTATTGGGCG CGGTAATGAT TCCTACGATG

4981 AAAATAAAAA CGGCTTGCTT GTTCTCGATG AGTGCGGTAC TTGGTTTAAT ACCCGTTCTT

5041 GGAATGATAA GGAAAGACAG CCGATTATTG ATTGGTTTCT ACATGCTCGT AAATTAGGAT

5101 GGGATATTAT TTTTCTTGTT CAGGACTTAT CTATTGTTGA TAAACAGGCG CGTTCTGCAT 

5161 TAGCTGAACA TGTTGTTTAT TGTCGTCGTC TGGACAGAAT TACTTTACCT TTTGTCGGTA 

5221 CTTTATATTC TCTTATTACT GGCTCGAAAA TGCCTCTGCC TAAATTACAT GTTGGCGTTG 

5281 TTAAATATGG CGATTCTCAA TTAAGCCCTA CTGTTGAGCG TTGGCTTTAT ACTGGTAAGA 

5341 ATTTGTATAA CGCATATGAT ACTAAACAGG CTTTTTCTAG TAATTATGAT TCCGGTGTTT

5401 ATTCTTATTT AACGCCTTAT TTATCACACG GTCGGTATTT CAAACCATTA AATTTAGGTC 

5461 AGAAGATGAA ATTAACTAAA ATATATTTGA AAAAGTTTTC TCGCGTTCTT TGTCTTGCGA

5521 TTGGATTTGC ATCAGCATTT ACATATAGTT ATATAACCCA ACCTAAGCCG GAGGTTAAAA 

5581 AGGTAGTCTC TCAGACCTAT GATTTTGATA AATTCACTAT TGACTCTTCT CAGCGTCTTA 

5641 ATCTAAGCTA TCGCTATGTT TTCAAGGATT CTAAGGGAAA ATTAATTAAT AGCGACGATT

5701 TACAGAAGCA AGGTTATTCA CTCACATATA TTGATTTATG TACTGTTTCC ATTAAAAAAG 

5761 GTAATTCAAA TGAAATTGTT AAATGTAATT AATTTTGTTT TCTTGATGTT TGTTTCATCA

5821 TCTTCTTTTG CTCAGGTAAT TGAAATGAAT AATTCGCCTC TGCGCGATTT TGTAACTTGG 

5881 TATTCAAAGC AATCAGGCGA ATCCGTTATT GTTTCTCCCG ATGTAAAAGG TACTGTTACT 

5941 GTATATTCAT CTGACGTTAA ACCTGAAAAT CTACGCAATT TCTTTATTTC TGTTTTACGT

6001 GCTAATAATT TTGATATGGT TGGTTCAATT CCTTCCATAA TTCAGAAGTA TAATCCAAAC 

6061 AATCAGGATT ATATTGATGA ATTGCCATCA TCTGATAATC AGGAATATGA TGATAATTCC 

6121 GCTCCTTCTG GTGGTTTCTT TGTTCCGCAA AATGATAATG TTACTCAAAC TTTTAAAATT 

6181 AATAACGTTC GGGCAAAGGA TTTAATACGA GTTGTCGAAT TGTTTGTAAA GTCTAATACT 

6241 TCTAAATCCT CAAATGTATT ATCTATTGAC GGCTCTAATC TATTAGTTGT TTCTGCACCT

6301 AAAGATATTT TAGATAACCT TCCTCAATTC CTTTCTACTG TTGATTTGCC AACTGACCAG 

6361 ATATTGATTG AGGGTTTGAT ATTTGAGGTT CAGCAAGGTG ATGCTTTAGA TTTTTCATTT 

6421 GCTGCTGGCT CTCAGCGTGG CACTGTTGCA GGCGGTGTTA ATACTGACCG CCTCACCTCT 

6481 GTTTTATCTT CTGCTGGTGG TTCGTTCGGT ATTTTTAATG GCGATGTTTT AGGGCTATCA 

6541 GTTCGCGCAT TAAAGACTAA TAGCCATTCA AAAATATTGT CTGTGCCACG TATTCTTACG

6601 CTTTCAGGTC AGAAGGGTTC TATCTCTGTT GGCCAGAATG TCCCTTTTAT TACTGGTCGT 

6661 GTGACTGGTG AATCTGCCAA TGTAAATAAT CCATTTCAGA CGATTGAGCG TCAAAATGTA 

6721 GGTATTTCCA TGAGCGTTTT TCCTGTTGCA ATGGCTGGCG GTAATATTGT TCTGGATATT

6781 ACCAGCAAGG CCGATAGTTT GAGTTCTTCT ACTCAGGCAA GTGATGTTAT TACTAATCAA

6841 AGAAGTATTG CTACAACGGT TAATTTGCGT GATGGACAGA CTCTTTTACT CGGTGGCCTC

6901 ACTGATTATA AAAACACTTC TCAAGATTCT GGCGTACCGT TCCTGTCTAA AATCCCTTTA

6961 ATCGGCCTCC TGTTTAGCTC CCGCTCTGAT TCCAACGAGG AAAGCACGTT ATACGTGCTC

7021 GTCAAAGCAA CCATAGTACG CGCCCTGTAG CGGCGCATTA AGCGCGGCGG GTGTGGTGGT

7081 TACGCGCAGC GTGACCGCTA CACTTGCCAG CGCCCTAGCG CCCGCTCCTT TCGCTTTCTT

7141 CCCTTCCTTT CTCGCCACGT TCGCCGGCTT TCCCCGTCAA GCTCTAAATC GGGGGCTCCC

7201 TTTAGGGTTC CGATTTAGTG CTTTACGGCA CCTCGACCCC AAAAAACTTG ATTTGGGTGA

7261 TGGTTCACGT AGTGGGCCAT CGCCCTGATA GACGGTTTTT CGCCCTTTGA CGTTGGAGTC

7321 CACGTTCTTT AATAGTGGAC TCTTGTTCCA AACTGGAACA ACACTCAACC CTATCTCGGG

7381 CTATTCTTTT GATTTATAAG GGATTTTGCC GATTTCGGAA CCACCATCAA ACAGGATTTT

7441 CGCCTGCTGG GGCAAACCAG CGTGGACCGC TTGCTGCAAC TCTCTCAGGG CCAGGCGGTG

7501 AAGGGCAATC AGCTGTTGCC CGTCTCACTG GTGAAAAGAA AAACCACCCT GGATCCAAGC

7561 TTGCAGGTGG CACTTTTCGG GGAAATGTGC GCGGAACCCC TATTTGTTTA TTTTTCTAAA

7621 TACATTCAAA TATGTATCCG CTCATGAGAC AATAACCCTG ATAAATGCTT CAATAATATT

7681 GAAAAAGGAA GAGTATGAGT ATTCAACATT TCCGTGTCGC CCTTATTCCC TTTTTTGCGG

7741 CATTTTGCCT TCCTGTTTTT GCTCACCCAG AAACGCTGGT GAAAGTAAAA GATGCTGAAG

7801 ATCAGTTGGG CGCACGAGTG GGTTACATCG AACTGGATCT CAACAGCGGT AAGATCCTTG

7861 AGAGTTTTCG CCCCGAAGAA CGTTTTCCAA TGATGAGCAC TTTTAAAGTT CTGCTATGTC

7921 ATACACTATT ATCCCGTATT GACGCCGGGC AAGAGCAACT CGGTCGCCGG GCGCGGTATT

7981 CTCAGAATGA CTTGGTTGAG TACTCACCAG TCACAGAAAA GCATCTTACG GATGGCATGA

8041 CAGTAAGAGA ATTATGCAGT GCTGCCATAA CCATGAGTGA TAACACTGCG GCCAACTTAC

8101 TTCTGACAAC GATCGGAGGA CCGAAGGAGC TAACCGCTTT TTTGCACAAC ATGGGGGATC

8161 ATGTAACTCG CCTTGATCGT TGGGAACCGG AGCTGAATGA AGCCATACCA AACGACGAGC

8221 GTGACACCAC GATGCCTGTA GCAATGCCAA CAACGTTGCG CAAACTATTA ACTGGCGAAC

8281 TACTTACTCT AGCTTCCCGG CAACAATTAA TAGACTGGAT GGAGGCGGAT AAAGTTGCAG

8341 GACCACTTCT GCGCTCGGCC CTTCCGGCTG GCTGGTTTAT TGCTGATAAA TCTGGAGCCG

8401 GTGAGCGTGG GTCTCGCGGT ATCATTGCAG CACTGGGGCC AGATGGTAAG CCCTCCCGTA

8461 TCGTAGTTAT CTACACGACG GGGAGTCAGG CAACTATGGA TGAACGAAAT AGACAGATCG

8521 CTGAGATAGG TGCCTCACTG ATTAAGCATT GGTAACTGTC AGACCAAGTT TACTCATATA

8581 TACTTTAGAT TGATTTAAAA CTTCATTTTT AATTTAAAAG GATCTAGGTG AAGATCCTTT

8641 TTGATAATCT CATGACCAAA ATCCCTTAAC GTGAGTTTTC GTTCCACTGT ACGTAAGACC

8701 CCCAAGCTTG TCGACTGAAT GGCGAATGGC GCTTTGCCTG GTTTCCGGCA CCAGAAGCGG

8761 TGCCGGAAAG CTGGCTGGAG TGCGATCTTC CTGAGGCCGA TACTGTCGTC GTCCCCTCAA

8821 ACTGGCAGAT GCACGGTTAC GATGCGCCCA TCTACACCAA CGTAACCTAT CCCATTACGG

8881 TCAATCCGCC GTTTGTTCCC ACGGAGAATC CGACGGGTTG TTACTCGCTC ACATTTAATG

8941 TTGATGAAAG CTGGCTACAG GAAGGCCAGA CGCGAATTAT TTTTGATGGC GTTCCTATTG

9001 GTTAAAAAAT GAGCTGATTT AACAAAAATT TAACGCGAAT TTTAACAAAA TATTAACGTT

9061 TACAATTTAA ATATTTGCTT ATACAATCTT CCTGTTTTTG GGGCTTTTCT GATTATCAAC

9121 CGGGGTACAT ATGATTGACA TGCTAGTTTT ACGATTACCG TTCATCGATT CTCTTGTTTG

9181 CTCCAGACTC TCAGGCAATG ACCTGATAGC CTTTGTAGAT CTCTCAAAAA TAGCTACCCT

9241 CTCCGGCATG AATTTATCAG CTAGAACGGT TGAATATCAT ATTGATGGTG ATTTGACTGT

9301 CTCCGGCCTT TCTCACCCTT TTGAATCTTT ACCTACACAT TACTCAGGCA TTGCATTTAA

9361 AATATATGAG GGTTCTAAAA ATTTTTATCC TTGCGTTGAA ATAAAGGCTT CTCCCGCAAA

9421 AGTATTACAG GGTCATAATG TTTTTGGTAC AACCGATTTA GCTTTATGCT CTGAGGCTTT

9481 ATTGCTTAAT TTTGCTAATT CTTTGCCTTG CCTGTATGAT TTATTGGATG TT

Table 25: h3401-h2 captured via CJ with BsmAI
! 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
! S A Q D I Q M T Q S P A T L S
aGT GCA Caa gac atc cag atg acc cag tct cca gcc acc ctg tct
! ApaLI... a gcc acc ! L25,L6,L20,L2,L16,A11
! Extender............Bridge...

! 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
! V S P G E R A T L S C R A S Q
gtg tct cca ggg gaa agg gcc acc ctc tcc tgc agg gcc agt cag

! 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
! S V S N N L A W Y Q Q K P G Q
agt gtt agt aac aac tta gcc tgg tac cag cag aaa cct ggc cag

! 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
! V P R L L I Y G A S T R A T D
gtt ccc agg ctc ctc atc tat ggt gca tcc acc agg gcc act gat

! 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75
! I P A R F S G S G S G T D F T
atc cca gcc agg ttc agt ggc agt ggg tct ggg aca gac ttc act

! 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90
! L T I S R L E P E D F A V Y Y
ctc acc atc agc aga ctg gag cct gaa gat ttt gca gtg tat tac

! 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105
! C Q R Y G S S P G W T F G Q G
tgt cag cgg tat ggt agc tca ccg ggg tgg acg ttc ggc caa ggg

! 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120
! T K V E I K R T V A A P S V F
acc aag gtg gaa atc aaa cga act gtg gct gca cca tct gtc ttc

! 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135
! I F P P S D E Q L K S G T A S
atc ttc ccg cca tct gat gag cag ttg aaa tct gga act gcc tct

! 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150
! V V C L L N N F Y P R E A K V
gtt gtg tgc ctg ctg aat aac ttc tat ccc aga gag gcc aaa gta

! 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165
! Q W K V D N A L Q S G N S Q E
cag tgg aag gtg gat aac gcc ctc caa tcg ggt aac tcc cag gag

! 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180
! S V T E Q D S K D S T Y S L S
agt gtc aca gag cag gac agc aag gac agc acc tac agc ctc agc

! 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195
! S T L T L S K A D Y E K H K V
agc acc ctg acg ctg agc aaa gca gac tac gag aaa cac aaa gtc

! 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210
! Y A C E V T H Q G L S S P V T
tac gcc tgc gaa gtc acc cat cag ggc ctg agc tcg cct gtc aca

! 211 212 213 214 215 216 217 218 219 220 221 222 223
! K S F N K G E C K G E F A
aag agc ttc aac aaa gga gag tgt aag ggc gaa ttc gc

Table 26: h3401-d8 KAPPA captured with CJ and BsmAI

!   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
!   S   A   Q   D   I   Q   M   T   Q   S   P   A   T   L   S
  aGT GCA Caa gac atc cag atg acc cag tct cct gcc acc ctg tct
!  ApaLI...Extender |                     gcc acc ! L25,L6,L20,L2,L16,A11
!                                       A GCC ACC CTG TCT ! L2

!  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30
!   V   S   P   G   E   R   A   T   L   S   C   R   A   S   Q
  gtg tct cca ggt gaa aga gcc acc ctc tcc tgc agg gcc agt cag
! GTG TCT CCA GGG GAA AGA GCC ACC CTC TCC TGC ! L2

!  31  32  33  34  35  36  37  38  39  40  41  42  43  44  45
!   N   L   L   S   N   L   A   W   Y   Q   Q   K   P   G   Q
  aat ctt ctc agc aac tta gcc tgg tac cag cag aaa cct ggc cag

!  46  47  48  49  50  51  52  53  54  55  56  57  58  59  60
!   A   P   R   L   L   I   Y   G   A   S   T   G   A   I   G
  gct ccc agg ctc ctc atc tat ggt gct tcc acc ggg gcc att ggt

!  61  62  63  64  65  66  67  68  69  70  71  72  73  74  75
!   I   P   A   R   F   S   G   S   G   S   G   T   E   F   T
  atc cca gcc agg ttc agt ggc agt ggg tct ggg aca gag ttc act

!  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
!   L   T   I   S   S   L   Q   S   E   D   F   A   V   Y   F
  ctc acc atc agc agc ctg cag tct gaa gat ttt gca gtg tat ttc

!  91  92  93  94  95  96  97  98  99 100 101 102 103 104 105
!   C   Q   Q   Y   G   T   S   P   P   T   F   G   G   G   T
  tgt cag cag tat ggt acc tca ccg ccc act ttc ggc gga ggg acc

! 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120
!   K   V   E   I   K   R   T   V   A   A   P   S   V   F   I
  aag gtg gag atc aaa cga act gtg gct gca cca tct gtc ttc atc

! 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135
!   F   P   P   S   D   E   Q   L   K   S   G   T   A   S   V
  ttc ccg cca tct gat gag cag ttg aaa tct gga act gcc tct gtt

! 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150
!   V   C   P   L   N   N   F   Y   P   R   E   A   K   V   Q
  gtg tgc ccg ctg aat aac ttc tat ccc aga gag gcc aaa gta cag

! 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165
!   W   K   V   D   N   A   L   Q   S   G   N   S   Q   E   S
  tgg aag gtg gat aac gcc ctc caa tcg ggt aac tcc cag gag agt

! 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180
!   V   T   E   Q   D   N   K   D   S   T   Y   S   L   S   S
  gtc aca gag cag gac aac aag gac agc acc tac agc ctc agc agc






! 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195
!   T   L   T   L   S   K   V   D   Y   E   K   H   E   V   Y
  acc ctg acg ctg agc aaa gta gac tac gag aaa cac gaa gtc tac

! 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210
!   A   C   E   V   T   H   Q   G   L   S   S   P   V   T   K
  gcc tgc gaa gtc acc cat cag ggc ctt agc tcg ccc gtc acg aag

! 211 212 213 214 215 216 217 218 219 220 221 222 223
!   S   F   N   R   G   E   C   K   K   E   F   V
  agc ttc aac agg gga gag tgt aag aaa gaa ttc gtt t
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Table 27: V3-23 VH framework with variegated codons shown 

!
!                   17  18  19  20  21  22
!                    A   Q   P   A   M   A
 5'-ctg tct gaa cG GCC cag ccG GCC atg gcc   29
 3'-gac aga ctt gc cgg gtc ggc cgg tac cgg
!              Scab    SfiI
!                      NgoMI...
!                              NcoI....
!
!   FR1(DP47/V3-23)
!   23  24  25  26  27  28  29  30
!    E   V   Q   L   L   E   S   G
  |gaa|gtt|CAA|TTG|tta|gag|tct|ggt|   53
! |ctt|caa|gtt|aac|aat|ctc|aga|cca|
!         | MfeI  |
!
!   --------------FR1-------------------------------------------
!   31  32  33  34  35  36  37  38  39  40  41  42  43  44  45
!    G   G   L   V   Q   P   G   G   S   L   R   L   S   C   A
  |ggc|ggt|ctt|gtt|cag|cct|ggt|ggt|tct|tta|cgt|ctt|tct|tgc|gct|   98
! |ccg|cca|gaa|caa|gtc|gga|cca|cca|aga|aat|gca|gaa|aga|acg|cga|
!
!   Sites to be varied--->  *** ***
!   ----FR1---->|...CDR1....|---FR2------
!   46  47  48  49  50  51  52  53  54  55  56  57  58  59  60
!    A   S   G   F   T   F   S   S   Y   A   M   S   W   V   R
  |gct|TCC|GGA|ttc|act|ttc|tct|tCG|TAC|Gct|atg|tct|tgg|gtt|cgC|  143
! |cga|agg|cct|aag|tga|aag|aga|agc|atg|cga|tac|aga|acc|caa|gcg|
!     | BspEI |               | BsiWI  |                  |BstXI.
!
!   Sites to be varied--->  *** ***
!   ----FR2---->|...CDR2....
!   61  62  63  64  65  66  67  68  69  70  71  72  73  74  75
!    Q   A   P   G   K   G   L   E   W   V   S   A   I   S   G
  |CAa|gct|ccT|GGt|aaa|ggt|ttg|gag|tgg|gtt|tct|gct|atc|tct|ggt|  188
! |gtt|cga|gga|cca|ttt|cca|aac|ctc|acc|caa|aga|cga|tag|aga|cca|
! ...BstXI       |
!
!   ....CDR2....|---FR3---
!   76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
!    S   G   G   S   T   Y   Y   A   D   S   V   K   G   R   F
  |tct|ggt|ggc|agt|act|tac|tat|gct|gac|tcc|gtt|aaa|ggt|cgc|ttc|  233
! |aga|cca|ccg|tca|tga|atg|ata|cga|ctg|agg|caa|ttt|cca|gcg|aag|
!
!   --------FR3--------
!   91  92  93  94  95  96  97  98  99 100 101 102 103 104 105
!    T   I   S   R   D   N   S   K   N   T   L   Y   L   Q   M
  |act|atc|TCT|AGA|gac|aac|tct|aag|aat|act|ctc|tac|ttg|cag|atg|  278
! |tga|tag|aga|tct|ctg|ttg|aga|ttc|tta|tga|gag|atg|aac|gtc|tac|
!         | XbaI  |
!
!   --------FR3--------
!  106 107 108 109 110 111 112 113 114 115 116 117 118 119 120
!    N   S   L   R   A   E   D   T   A   V   Y   Y   C   A   K
  |aac|agC|TTA|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat|tgc|gct|aaa|  323
! |ttg|tcg|aat|tcc|cga|ctc|ctg|tga|cgt|cag|atg|ata|acg|cga|ttt|
!     |  AflII   |            |   PstI   |
!
!   ....CDR3....|---FR4---
!  121 122 123 124 125 126 127 128 129 130 131 132 133 134 135
!    D   Y   E   G   T   G   Y   A   F   D   I   W   G   Q   G
  |gac|tat|gaa|ggt|act|ggt|tat|gct|ttc|gaC|ATA|TGg|ggt|caa|ggt|  368
! |ctg|ata|ctt|cca|tga|cca|ata|cga|aag|ctg|tat|acc|cca|gtt|cca|
!                                     |  NdeI   |
!
!   --------FR4--------
!  136 137 138 139 140 141 142
!    T   M   V   T   V   S   S
  |act|atG|GTC|ACC|gtc|tct|agt-   389
! |tga|tac|cag|tgg|cag|aga|tca-
!     |  BstEII  |
!
!  143 144 145 146 147 148 149 150 151 152
!    A   S   T   K   G   P   S   V   F   P
   gcc tcc acc aaG GGC CCa tcg GTC TTC ccc-3'   419
   cgg agg tgg ttc ccg ggt agc cag aag ggg-5'
!              Bsp120I.      BbsI...(2/2)
!              ApaI....

(SFPRMET)    5'-ctg tct gaa cG GCC cag ccG-3'
(TOPFR1A)    5'-ctg tct gaa cG GCC cag ccG GCC atg gcc-
                gaa|gtt|CAA|TTG|tta|gag|tct|ggt|-
                |ggc|ggt|ctt|gtt|cag|cct|ggt|ggt|tct|tta-3'
(BOTFR1B)    3'-caa|gtc|gga|cca|cca|aga|aat|gca|gaa|aga|acg|cga|-
                |cga|agg|cct|aag|tga|aag-5' ! bottom strand
(BOTFR2)     3'-acc|caa|gcg|-
                |gtt|cga|gga|cca|ttt|cca|aac|ctc|acc|caa|aga|-5' ! bottom strand
(BOTFR3)     3'- a|cga|ctg|agg|caa|ttt|cca|gcg|aag|-
                |tga|tag|aga|tct|ctg|ttg|aga|ttc|tta|tga|gag|atg|aac|gtc|tac|-
                |ttg|tcg|aat|tcc|cga|ctc|ctg|tga-5'
(F06)        5'-gC|TTA|AGg|gct|gag|gac|aCT|GCA|Gtc|tac|tat|tgc|gct|aaa|-
                |gac|tat|gaa|ggt|act|ggt|tat|gct|ttc|gaC|ATA|TGg|ggt|c-3'
(BOTFR4)     3'-cga|aag|ctg|tat|acc|cca|gtt|cca|-
                |tga|tac|cag|tgg|cag|aga|tca-
                cgg agg tgg ttc ccg ggt agc cag aag ggg-5' ! bottom strand
(BOTPRCPRIM) 3'-gg ttc ccg ggt agc cag aag ggg-5'



CDR1 diversity

(ON-vgC1) 5'-|gct|TCC|GGA|ttc|act|ttc|tct|<1>|TAC|<1>|atg|<1>|-
!                                          CDR1 6859
             |tgg|gtt|cgC|CAa|gct|ccT|GG-3'

<1> stands for an equimolar mix of {ADEFGHIKLMNPQRSTVWY}; no C
(this is not a sequence)

CDR2 diversity

(ON-vgC2) 5'-ggt|ttg|gag|tgg|gtt|tct|<2>|atc|<2>|<3>|-
!                                     CDR2
             |tct|ggt|ggc|<1>|act|<1>|tat|gct|gac|tcc|gtt|aaa|gg-3'
!                          CDR2

! <1> is an equimolar mixture of {ADEFGHIKLMNPQRSTVWY}; no C
! <2> is an equimolar mixture of {YRWVGS}; no ACDEFHIKLMNPQT
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! <3> is an equimolar mixture of {PS}; no ACDEFGHIKLMNQRTVWY 
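
The figure 6859 noted against CDR1 is simply 19 x 19 x 19, the number of peptide sequences obtainable from three <1> positions. The following minimal sketch reproduces that count and applies the same arithmetic to CDR2. It assumes that the CDR2 oligonucleotide carries the variegated positions <2>, <2>, <3>, <1>, <1> in that order; the function name `diversity` is illustrative and does not come from the specification.

```python
from math import prod

# Amino-acid sets for the variegated codons, as listed in the table notes.
MIX = {
    "<1>": set("ADEFGHIKLMNPQRSTVWY"),  # 19 residues, no C
    "<2>": set("YRWVGS"),               # 6 residues
    "<3>": set("PS"),                   # 2 residues
}

def diversity(positions):
    """Number of distinct peptide sequences for a list of variegated positions."""
    return prod(len(MIX[p]) for p in positions)

# CDR1 has three <1> positions.
cdr1 = diversity(["<1>", "<1>", "<1>"])
# CDR2 positions are an assumption read from the ON-vgC2 oligonucleotide.
cdr2 = diversity(["<2>", "<2>", "<3>", "<1>", "<1>"])

print("CDR1:", cdr1)
print("CDR2:", cdr2)
print("CDR1 x CDR2:", cdr1 * cdr2)
```

These are theoretical peptide counts only; the number of clones actually obtained depends on ligation and transformation efficiency.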
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Table 30: Oligonucleotides used to clone CDR1/2 diversity 
All sequences are 5' to 3'.

1) ON_CD1Bsp, 30 bases
AccTcAcTggcTTccggATTcAcTTTcTcT

2) ON_Br12, 42 bases
AgAAAcccAcTccAAAccTTTAccAggAgcTTggcgAAcccA

3) ON_CD2Xba, 51 bases
ggAAggcAgTgATcTAgAgATAgTgAAgcgAccTTTAAcggAgTcAgcATA

4) ON_BotXba, 23 bases
ggAAggcAgTgATcTAgAgATAg
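
The stated lengths of these oligonucleotides, and the restriction sites their names suggest, can be checked mechanically. The sketch below is illustrative only: it assumes that "Bsp" and "Xba" in the names refer to BspEI (TCCGGA) and XbaI (TCTAGA), and the helper `find_sites` is a hypothetical name.

```python
# Oligonucleotides as listed in Table 30, each with its stated length.
OLIGOS = {
    "ON_CD1Bsp": ("AccTcAcTggcTTccggATTcAcTTTcTcT", 30),
    "ON_Br12": ("AgAAAcccAcTccAAAccTTTAccAggAgcTTggcgAAcccA", 42),
    "ON_CD2Xba": ("ggAAggcAgTgATcTAgAgATAgTgAAgcgAccTTTAAcggAgTcAgcATA", 51),
    "ON_BotXba": ("ggAAggcAgTgATcTAgAgATAg", 23),
}

# Assumed recognition sequences for the enzymes hinted at by the names.
SITES = {"BspEI": "TCCGGA", "XbaI": "TCTAGA"}

def find_sites(seq):
    """Return {enzyme: 1-based start} for each site found on the given strand."""
    upper = seq.upper()
    return {name: upper.find(site) + 1 for name, site in SITES.items() if site in upper}

for name, (seq, stated_length) in OLIGOS.items():
    status = "ok" if len(seq) == stated_length else "LENGTH MISMATCH"
    print(name, len(seq), status, find_sites(seq))

# ON_BotXba is identical to the first 23 bases of ON_CD2Xba.
print(OLIGOS["ON_CD2Xba"][0].startswith(OLIGOS["ON_BotXba"][0]))
```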
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Table 31: Bridge/Extender Oligonucleotides 



ON_Lam1aB7 (rc)       GTGCTGACTCAGCCACCCTC                      20
ON_Lam2aB7 (rc)       GCCCTGACTCAGCCTGCCTC                      20
ON_Lam3lB7 (rc)       GAGCTGACTCAGGACCCTGC                      20
ON_Lam3rB7 (rc)       GAGCTGACTCAGCCACCCTC                      20
ON_LamHf1cBrg (rc)    CCTCGACAGCGAAGTGCACAGAGCGTCTTGACTCAGCC    38
ON_LamHf1cExt         CCTCGACAGCGAAGTGCACAGAGCGTCTTG            30
ON_LamHf2b2Brg (rc)   CCTCGACAGCGAAGTGCACAGAGCGCTTTGACTCAGCC    38
ON_LamHf2b2Ext        CCTCGACAGCGAAGTGCACAGAGCGCTTTG            30
ON_LamHf2dBrg (rc)    CCTCGACAGCTAAGTGCACAGAGCGCTTTGACTCAGCC    38
ON_LamHf2dExt         CCTCGACAGCGAAGTGCACAGAGCGCTTTG            30
ON_LamHf3lBrg (rc)    CCTCGACAGCGAAGTGCACAGAGCGAATTGACTCAGCC    38
ON_LamHf3lExt         CCTCGACAGCGAAGTGCACAGAGCGAATTG            30
ON_LamHf3rBrg (rc)    CCTCGACAGCGAAGTGCACAGTACGAATTGACTCAGCC    38
ON_LamHf3rExt         CCTCGACAGCGAAGTGCACAGTACGAATTG            30
ON_lamPlePCR          CCTCGACAGCGAAGTGCACAG                     21
Consensus
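
As printed, each bridge sequence begins with its extender, and each extender begins with the 21-base PCR primer and contains an ApaLI site (GTGCAC). The sketch below checks these string relationships for two of the pairs. It compares the sequences exactly as they are listed in the table and makes no claim about strand orientation in the actual hybridisation; the function name `relations` is hypothetical.

```python
PCR_PRIMER = "CCTCGACAGCGAAGTGCACAG"  # ON_lamPlePCR
APALI = "GTGCAC"

# (bridge, extender) pairs copied from Table 31.
PAIRS = {
    "1c": ("CCTCGACAGCGAAGTGCACAGAGCGTCTTGACTCAGCC",
           "CCTCGACAGCGAAGTGCACAGAGCGTCTTG"),
    "3r": ("CCTCGACAGCGAAGTGCACAGTACGAATTGACTCAGCC",
           "CCTCGACAGCGAAGTGCACAGTACGAATTG"),
}

def relations(bridge, extender):
    """Report how the printed bridge, extender and primer strings relate."""
    return {
        "extender_is_prefix_of_bridge": bridge.startswith(extender),
        "primer_is_prefix_of_extender": extender.startswith(PCR_PRIMER),
        "extender_has_ApaLI": APALI in extender,
        "bridge_overhang": bridge[len(extender):],
    }

for family, (bridge, extender) in PAIRS.items():
    print(family, relations(bridge, extender))
```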
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Table 32: Oligonucleotides used to make SSDNA locally 
double-stranded 

IT V — f 

H43HF3.1?02#1      5'-cc gtg tat tac tgt gcg aga g-3'
H43.77.97.1-03#2   5'-ct gtg tat tac tgt gcg aga g-3'
H43.77.97.323#22   5'-CC gt| tat tac tgt gcg aga g-3'
H43.77.97.330#23   5'-c| gtg tat tac tgt gcg afla g-3'
H43.77.97.439#44   5'-c| gtg tat tac tgt gcg aga |-3'
H43.77.97.551#48   5'-cc |tg tat tac tgt gcg aga |-3'
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Table 33: Bridge/extender pairs 

Bridges (2)

H43.XABr1

5'-ggtgtagtgaTCTAGtgacaactctaagaatactctctacttgcagatgaacagCTTtAGgg
ctgaggacaCTGCAGtctactattgtgcgaga-3'

H43.XABr2

5'-ggtgtagtgaTCTAGtgacaactctaagaatactctctacttgcagatgaacagCTTtAGgg
ctgaggacaCTGCAGtctactattgtgcgaaa-3'

Extender
H43.XAExt

5'-ATAgTAgAcTgcAgTgTccTcAgcccTTAAgcTgTTcATcTgcAAgTAgAgAgTATTcTTAg
AgTTgTcTcTAgATcAcTAcAcc-3'
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Table 34: PCR primers 
Primers 

UA *3 V 7\ DPD O . ^4- wfn _m-r\ _m m _m-»i _ 

Hucmnest cttttctttgttgccgttggggtg 
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Table 35: PCR program for amplification of 
heavy chain CDR3 DNA 



95 degrees C    5 minutes

95 degrees C    20 seconds  |
60 degrees C    30 seconds  |  repeat 20x
72 degrees C    1 minute    |

72 degrees C    7 minutes
 4 degrees C    hold

Reagents (100 ul reaction):

Template             5 ul ligation mix
10x PCR buffer       1x
Taq                  5 U
dNTPs                200 uM each
MgCl2                2 mM
H43.XAPCR2-biotin    400 nM
Hucmnest             200 nM
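
As a quick check of the program above, the total programmed time can be added up. This is a minimal sketch that counts only the hold times stated in the table; ramp times between temperatures and the final 4 degree hold are not included, and the variable names are illustrative.

```python
# Step durations in seconds, taken from the PCR program in Table 35.
INITIAL_DENATURATION = 5 * 60          # 95 C, 5 minutes
CYCLE_STEPS = [20, 30, 60]             # 95 C / 60 C / 72 C
CYCLES = 20
FINAL_EXTENSION = 7 * 60               # 72 C, 7 minutes

cycle_seconds = sum(CYCLE_STEPS)
total_seconds = INITIAL_DENATURATION + CYCLES * cycle_seconds + FINAL_EXTENSION

minutes, seconds = divmod(total_seconds, 60)
print(f"one cycle: {cycle_seconds} s")
print(f"total programmed time: {minutes} min {seconds} s")
```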
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Table 36: Annotated sequence of CJR DY3F7 (CJR-A05) 10251 bases 
Non-cutters 



DLll J. ydLUd 

BstZ17I GTAtac
FseI GGCCGGcc
PmeI GTTTaaac
RsrII CGgwccg
SgfI GCGATcgc
StuI AGGcct
BsiWI Cgtacg
BtrI CACgtg
HpaI GTTaac
PmlI CACgtg
SapI GCTCTTC
SgrAI CRccggyg
XmaI Cccggg
BssSI Cacgag
EcoRV GATatc
MluI Acgcgt
PpuMI RGgwccy
SexAI Accwggt
SphI GCATGc



Enzymes that cut from 1 to 4 times and other features 



!End of genes II and X 829
!Start gene V 843
!BsrGI Tgtaca 1 1021
!BspMI Nnnnnnnnngcaggt 3 1104 5997
!-"- ACCTGCNNNNn 1 2281
!End of gene V 1106
!Start gene VII 1108
!BsaBI GATNNnnatc 2 1149 3967
!Start gene IX 1208
!End gene VII 1211
!SnaBI TACgta 2 1268 7133
!BspHI Tcatga 3 1299 6085
!Start gene VIII 1301
!End gene IX 1304
!End gene VIII 1522
!Start gene III 1578
!EagI Cggccg 2 1630 8905
!XbaI Tctaga 2 1643 8436
!KasI Ggcgcc 4 1650 8724
!BsmI GAATGCN 2 1769 9065
!BseRI GAGGAGNNNNNNNNNN 2 2031 8516
!-"- NNnnnnnnnnctcctc 2 7603 8623
!AlwNI CAGNNNctg 3 2210 8072
!BspDI ATcgat 2 2520 9883
!NdeI CAtatg 3 2716 3796
!End gene III 2846
!Start gene VI 2848
!AfeI AGCgct 1 3032
!End gene VI 3187
!Start gene I 3189
!EarI CTCTTCNnnn 2 4067 9274
!-"- Nnnnngaagag 2 6126 8953
!PacI TTAATtaa 1 4125
!Start gene IV 4213
!End gene I 4235
!BsmFI Nnnnnnnnnnnnnnngtccc 2 5068 9515
!MscI TGGcca 3 5073 7597
!PsiI TTAtaa 2 5349 5837
!End gene IV 5493
!Start ori 5494
!NgoMIV Gccggc 3 5606 8213
!BanII GRGCYc 4 5636 8080
!DraIII CACNNNgtg 1 5709
!DrdI GACNNNNnngtc 1 5752
!AvaI Cycgrg 2 5818 7240



9039 9120 



9160 
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!PvuII CAGctg 1 5953 

!BsmBI CGTCTCNnnnn 3 5964 8585 9271
!End ori region 5993
!BamHI Ggatcc 1 5994
!HindIII Aagctt 3 6000 7147 7384
!BciVI GTATCCNNNNNN 1 6077
!Start bla 6138
!Eco57I CTGAAG 2 6238 7716
!SpeI Actagt 1 6257
!BcgI gcannnnnntcg 1 6398
!ScaI AGTact 1 6442
!PvuI CGATcg 1 6553
!FspI TGCgca 1 6700
!BglI GCCNNNNnggc 3 6801 8208 8976
!BsaI GGTCTCNnnnn 1 6853
!AhdI GACNNNnngtc 1 6920
!Eam1105I GACNNNnngtc 1 6920
!End bla 6998
!AccI GTmkac 2 7153 8048
!HincII GTYrac 1 7153
!SalI Gtcgac 1 7153
!XhoI Ctcgag 1 7240
!Start PlacZ region 7246
!End PlacZ region 7381
!PflMI CCANNNNntgg 1 7382
!RBS1 7405
!start M13-iii signal seq for LC 7418
!ApaLI Gtgcac 1 7470
!end M13-iii signal seq 7471
!Start light chain kappa L20:JK1 7472
!PflFI GACNnngtc 3 7489 8705 9099
!SbfI CCTGCAgg 1 7542
!PstI CTGCAg 1 7543
!KpnI GGTACc 1 7581
!XcmI CCANNNNNnnnntgg 2 7585 9215
!NsiI ATGCAt 2 7626 9503
!BsgI ctgcac 1 7809
!BbsI gtcttc 2 7820 8616
!BlpI GCtnagc 1 8017
!EspI GCtnagc 1 8017
!EcoO109I RGgnccy 2 8073 8605
!Ecl136I GAGctc 1 8080
!SacI GAGCTc 1 8080
!End light chain 8122
!AscI GGcgcgcc 1 8126
!BssHII Gcgcgc 1 8127
!RBS2 8147
!SfiI GGCCNNNNnggcc 1 8207
!NcoI Ccatgg 1 8218
!Start 3-23, FR1 8226
!MfeI Caattg 1 8232
!BspEI Tccgga 1 8298
!Start CDR1 8316
!Start FR2 8331
!BstXI CCANNNNNntgg 2 8339 8812
!EcoNI CCTNNnnnagg 2 8346 8675
!Start FR3 8373
!XbaI Tctaga 2 8436 1643
!AflII Cttaag 1 8480
!Start CDR3 8520
!AatII GACGTc 1 8556
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10 






! Start FR4 




ft f^fiO 




! PshAI GACNNnngtc 


o 
z 


oO / J 


y^ji 


JBstEII Ggtnacc 


i 

j. 






! Start CHI 








I Apa I GGGCCc 


i 

X 


ft fiflfi 




!Bspl20I Gggccc 


1 

X 


ODUD 




! PspOMI Gggccc 


1 
1 


ft 606 




!AgeI Accggt 


1 






!Bsu36I CCtnagg 


o 


O / r v 




!End of CHI 




O JUO 




!NotI GCggccgc 


X 


ft Qfid 




! Start His6 tag 




ftQ1 ^ 




' Start cMvc tacr 








i Amber codon 








! Nhel Gctagc 


X 


ft QQ R 




!Start M13 III Domain 3 8997
!NruI TCGcga 1 9106
!BstBI TTcgaa 1 9197
!EcoRI Gaattc 1 9200
!XcmI CCANNNNNnnnntgg 1 9215
!BstAPI GCANNNNntgc 1 9337
!SacII CCGCgg 1 9365
!End IIIstump anchor 9455
!AvrII Cctagg 1 9462
!trp terminator 9470
!SwaI ATTTaaat 1 9784
!Start gene II 9850
!BglII Agatct 1 9936
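
The feature list above gives, for each enzyme, a recognition sequence written with IUPAC ambiguity codes, the number of sites, and their positions. The following sketch shows one way such positions could be found by translating a recognition sequence into a regular expression. It is an illustration under stated assumptions, not the procedure used to prepare the table: it scans only the strand supplied, treats the sequence as linear rather than circular, reports the 1-based start of the recognition sequence, and uses a short made-up demonstration sequence. The names `site_regex` and `find_sites` are hypothetical.

```python
import re

# IUPAC nucleotide codes used in the recognition sequences of the table.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "M": "[AC]", "K": "[GT]", "N": "[ACGT]",
}

def site_regex(site):
    """Compile a recognition sequence (any case) into a lookahead regex."""
    pattern = "".join(IUPAC[base] for base in site.upper())
    # The lookahead allows overlapping occurrences to be reported.
    return re.compile("(?=(" + pattern + "))")

def find_sites(sequence, site):
    """Return 1-based start positions of a site on the given strand."""
    return [m.start() + 1 for m in site_regex(site).finditer(sequence.upper())]

# Made-up demonstration sequence, not taken from the vector.
demo = "ttGGTCACCaaGAATTCggGGTAACCtt"
print("BstEII (Ggtnacc):", find_sites(demo, "Ggtnacc"))
print("EcoRI  (Gaattc): ", find_sites(demo, "Gaattc"))
```

For asymmetric sites the table lists the two orientations on separate lines; with this sketch the second orientation would be searched by passing the reverse complement of the recognition sequence.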





   1 aat gct act act att agt aga att gat gcc acc ttt tca gct cgc gcc
     gene ii continued
  49 cca aat gaa aat ata gct aaa cag gtt att gac cat ttg cga aat gta
  97 tct aat ggt caa act aaa tct act cgt tcg cag aat tgg gaa tca act
 145 gtt aTa tgg aat gaa act tcc aga cac cgt act tta gtt gca tat tta
 193 aaa cat gtt gag cta cag caT TaT att cag caa tta agc tct aag cca
 241 tcc gca aaa atg acc tct tat caa aag gag caa tta aag gta ctc tct
 289 aat cct gac ctg ttg gag ttt gct tcc ggt ctg gtt cgc ttt gaa gct
 337 cga att aaa acg cga tat ttg aag tct ttc ggg ctt cct ctt aat ctt
 385 ttt gat gca atc cgc ttt gct tct gac tat aat agt cag ggt aaa gac
 433 ctg att ttt gat tta tgg tca ttc tcg ttt tct gaa ctg ttt aaa gca
 481 ttt gag ggg gat tca ATG aat att tat gac gat tcc gca gta ttg gac
     Start gene x, ii continues
 529 gct atc cag tct aaa cat ttt act att acc ccc tct ggc aaa act tct
 577 ttt gca aaa gcc tct cgc tat ttt ggt ttt tat cgt cgt ctg gta aac
 625 gag ggt tat gat agt gtt gct ctt act atg cct cgt aat tcc ttt tgg
 673 cgt tat gta tct gca tta gtt gaa tgt ggt att cct aaa tct caa ctg
 721 atg aat ctt tct acc tgt aat aat gtt gtt ccg tta gtt cgt ttt att
 769 aac gta gat ttt tct tcc caa cgt cct gac tgg tat aat gag cca gtt
 817 ctt aaa atc gca TAA
     End X & II
 832 ggtaattca ca
M1 E5 Q10 T15

843 ATG att aaa gtt gaa att aaa cca tct caa gcc caa ttt act act cgt
Start gene V

S17 S20 P25 E30

891 tct ggt gtt tct cgt cag ggc aag cct tat tca ctg aat gag cag ctt

V35 E40 V45

939 tgt tac gtt gat ttg ggt aat gaa tat ccg gtt ctt gtc aag att act
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D50 A55 L60 

987 ctt gat gaa ggt cag cca gcc tat gcg cct ggt cTG TAC Acc gtt cat 

BsrGI . . . 

L65 V70 S75 R80

1035 ctg tcc tct ttc aaa gtt ggt cag ttc ggt tcc ctt atg att gac cgt

P85 K87 end of V

1083 ctg cgc ctc gtt ccg gct aag TAA C

1108 ATG gag cag gtc gcg gat ttc gac aca att tat cag gcg atg
Start gene VII

1150 ata caa atc tcc gtt gta ctt tgt ttc gcg ctt ggt ata atc

VII and IX overlap.

S2 V3 L4 V5 S10

1192 gct ggg ggt caa agA TGA gt gtt tta gtg tat tct ttT gcc tct ttc gtt

End VII
|start IX

L13 W15 G20 T25 E29

1242 tta ggt tgg tgc ctt cgt agt ggc att acg tat ttt acc cgt tta atg gaa

1293 act tcc tc

stop of IX, IX and VIII overlap by four bases

1301 ATG aaa aag tct tta gtc ctc aaa gcc tct gta gcc gtt gct acc ctc
Start signal sequence of viii.

1349 gtt ccg atg ctg tct ttc gct gct gag ggt gac gat ccc gca aaa gcg

mature VIII >

1397 gcc ttt aac tcc ctg caa gcc tca gcg acc gaa tat atc ggt tat gcg

1445 tgg gcg atg gtt gtt gtc att

1466 gtc ggc gca act atc ggt atc aag ctg ttt aag

bases 1499-1539 are probable promoter for iii
1499 aaa ttc acc tcg aaa gca ! 1515
-35

1517 agc tga taaaccgat acaattaaag gctccttttg

-10

1552 gagccttttt ttt GGAGAt ttt ! S.D. uppercase, there may be 9 Ts

< III signal sequence >

M K K L L F A I P L V V P F
1574 caac GTG aaa aaa tta tta ttc gca att cct tta gtt gtt cct ttc ! 1620

Y S G A A E S H L D G A
1620 tat tct ggc gCG GCC Gaa tca caT CTA GAc ggc gcc
EagI.... XbaI....



Domain 1

A E T V E S C L A

1656 gct gaa act gtt gaa agt tgt tta gca

K S H T E N S F T N V W K D D K T
1683 aaA Tcc cat aca gaa aat tca ttt aCT AAC GTC TGG AAA GAC GAC AAA ACt

L D R Y A N Y E G C L W N A T G V
1734 tta gat cgt tac gct aac tat gag ggC tgt ctg tgG AAT GCt aca ggc gtt



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BsmI . . . . 

VVCTGDETQCYGTWVPI 
1785 gta gtt tgt act ggt GAC GAA ACT CAG TGT TAC GGT ACA TGG GTT cct att 

G L A I P E N
1836 ggg ctt gct atc cct gaa aat

L1 linker

E G G G S E G G G S
1857 gag ggt ggt ggc tct gag ggt ggc ggt tct

! E G G G S E G G G T
1887 gag ggt ggc ggt tct gag ggt ggc ggt act

! Domain 2

1917 aaa cct cct gag tac ggt gat aca cct att ccg ggc tat act tat atc aac

1968 cct ctc gac ggc act tat ccg cct ggt act gag caa aac ccc gct aat cct

2019 aat cct tct ctt GAG GAG tct cag cct ctt aat act ttc atg ttt cag aat
! BseRI..

2070 aat agg ttc cga aat agg cag ggg gca tta act gtt tat acg ggc act

2118 gtt act caa ggc act gac ccc gtt aaa act tat tac cag tac act cct

2166 gta tca tca aaa gcc atg tat gac gct tac tgg aac ggt aaa ttc AGA

AlwNI

2214 GAC TGc gct ttc cat tct ggc ttt aat gaG gat TTa ttT gtt tgt gaa

AlwNI

2262 tat caa ggc caa tcg tct gac ctg cct caa cct cct gtc aat gct

2307 ggc ggc ggc tct
! start L2
2319 ggt ggt ggt tct
2331 ggt ggc ggc tct

2343 gag ggt ggt ggc tct gag gga ggc ggt tcc
2373 ggt ggt ggc tct ggt ! end L2


Many published sequences of M13-derived phage have a longer linker 
than shown here by repeats of the EGGGS motif two more times. 



Domain 3

!     S   G   D   F   D   Y   E   K   M   A   N   A   N   K   G   A
2388 tcc ggt gat ttt gat tat gaa aag atg gca aac gct aat aag ggg gct

!     M   T   E   N   A   D   E   N   A   L   Q   S   D   A   K   G
2436 atg acc gaa aat gcc gat gaa aac gcg cta cag tct gac gct aaa ggc

!     K   L   D   S   V   A   T   D   Y   G   A   A   I   D   G   F
2484 aaa ctt gat tct gtc gct act gat tac ggt gct gct atc gat ggt ttc

!     I   G   D   V   S   G   L   A   N   G   N   G   A   T   G   D
2532 att ggt gac gtt tcc ggc ctt gct aat ggt aat ggt gct act ggt gat

!     F   A   G   S   N   S   Q   M   A   Q   V   G   D   G   D   N
2580 ttt gct ggc tct aat tcc caa atg gct caa gtc ggt gac ggt gat aat

!     S   P   L   M   N   N   F   R   Q   Y   L   P   S   L   P   Q
2628 tca cct tta atg aat aat ttc cgt caa tat tta cct tcc ctc cct caa

!     S   V   E   C   R   P   F   V   F   G   A   G   K   P   Y   E
2676 tcg gtt gaa tgt cgc cct ttt gtc ttt Ggc gct ggt aaa cca tat gaa

!     F   S   I   D   C   D   K   I   N   L   F   R
2724 ttt tct att gat tgt gac aaa ata aac tta ttc cgt

! Domain 3

!     G   V   F   A   F   L   L   Y   V   A   T   F   M   Y   V   F140
2760 ggt gtc ttt gcg ttt ctt tta tat gtt gcc acc ttt atg tat gta ttt
!    start transmembrane segment

!     S   T   F   A   N   I   L
2808 tct acg ttt gct aac ata ctg

!     R   N   K   E   S
2829 cgt aat aag gag tct TAA ! stop of iii
!    Intracellular anchor




























!            M1  P2  V   L   L5  G   I   P   L   L10 L   R   F   L   G15
2847 tc ATG cca gtt ctt ttg ggt att ccg tta tta ttg cgt ttc ctc ggt
!    Start VI

2894 ttc ctt ctg gta act ttg ttc ggc tat ctg ctt act ttt ctt aaa aag
2942 ggc ttc ggt aag ata gct att gct att tca ttg ttt ctt gct ctt att
2990 att ggg ctt aac tca att ctt gtg ggt tat ctc tct gat att agc gct
3038 caa tta ccc tct gac ttt gtt cag ggt gtt cag tta att ctc ccg tct
3086 aat gcg ctt ccc tgt ttt tat gtt att ctc tct gta aag gct gct att
3134 ttc att ttt gac gtt aaa caa aaa atc gtt tct tat ttg gat tgg gat

!                M1  A2  V3      F5                  L10         G13
3182 aaa TAA t ATG gct gtt tat ttt gta act ggc aaa tta ggc tct gga
!    end VI    Start gene I



!     K   T   L   V   S   V   G   K   I   Q   D   K   I   V   A
3228 aag acg ctc gtt agc gtt ggt aag att cag gat aaa att gta gct

!     G   C   K   I   A   T   N   L   D   L   R   L   Q   N   L
3273 ggg tgc aaa ata gca act aat ctt gat tta agg ctt caa aac ctc

!     P   Q   V   G   R   F   A   K   T   P   R   V   L   R   I
3318 ccg caa gtc ggg agg ttc gct aaa acg cct cgc gtt ctt aga ata

!     P   D   K   P   S   I   S   D   L   L   A   I   G   R   G
3363 ccg gat aag cct tct ata tct gat ttg ctt gct att ggg cgc ggt

!     N   D   S   Y   D   E   N   K   N   G   L   L   V   L   D
3408 aat gat tcc tac gat gaa aat aaa aac ggc ttg Ctt gtt ctc gat

!     E   C   G   T   W   F   N   T   R   S   W   N   D   K   E
3453 gag tgc ggt act tgg ttt aat acc cgt tct tgg aat gat aag gaa

!     R   Q   P   I   I   D   W   F   L   H   A   R   K   L   G
3498 aga cag ccg att att gat tgg ttt cta cat gct cgt aaa tta gga

!     W   D   I   I   F   L   V   Q   D   L   S   I   V   D   K
3543 tgg gat att att ttt ctt gtt cag gac tta tct att gtt gat aaa

!     Q   A   R   S   A   L   A   E   H   V   V   Y   C   R   R
3588 cag gcg cgt tct gca tta gct gaa cat gtt gtt tat tgt cgt cgt

!     L   D   R   I   T   L   P   F   V   G   T   L   Y   S   L
3633 ctg gac aga att act tta cct ttt gtc ggt act tta tat tct ctt

!     I   T   G   S   K   M   P   L   P   K   L   H   V   G   V
3678 att act ggc tcg aaa atg cct ctg cct aaa tta cat gtt ggc gtt

!     V   K   Y   G   D   S   Q   L   S   P   T   V   E   R   W
3723 gtt aaa tat ggc gat tct caa tta agc cct act gtt gag cgt tgg

!     L   Y   T   G   K   N   L   Y   N   A   Y   D   T   K   Q
3768 ctt tat act ggt aag aat ttg tat aac gca tat gat act aaa cag

!     A   F   S   S   N   Y   D   S   G   V   Y   S   Y   L   T
3813 gct ttt tct agt aat tat gat tcc ggt gtt tat tct tat tta acg

!     P   Y   L   S   H   G   R   Y   F   K   P   L   N   L   G
3858 cct tat tta tca cac ggt cgg tat ttc aaa cca tta aat tta ggt

!     Q   K   M   K   L   T   K   I   Y   L   K   K   F   S   R
3903 cag aag atg aaa tta act aaa ata tat ttg aaa aag ttt tct cgc


3948 


V 
att 


L 
ctt 


C 
tat 


L 
ctt 


A 

C1CCS 

y *-y 


I 
att 


G 


F 
ttt 


A 
gca 


S 
tea 


A 
gca 


F 
ttt 


T 
aca 


Y 
tat 


S 


!     Y   I   T   Q   P   K   P   E   V   K   K   V   V   S   Q
3993 tat ata acc caa cct aag ccg gag gtt aaa aag gta gtc tct cag

!     T   Y   D   F   D   K   F   T   I   D   S   S   Q   R   L
4038 acc tat gat ttt gat aaa ttc act att gac tct tct cag cgt ctt

!     N   L   S   Y   R   Y   V   F   K   D   S   K   G   K   L
4083 aat cta agc tat cgc tat gtt ttc aag gat tct aag gga aaa TTA
!                                                             PacI

!     I   N   S   D   D   L   Q   K   Q   G   Y   S   L   T   Y
4128 ATT AAt agc gac gat tta cag aag caa ggt tat tca ctc aca tat
!    PacI

!     I   D   L   C   T   V   S   I   K   K   G   N   S   N   E
!                                                     iv  M1  K
4173 att gat tta tgt act gtt tcc att aaa aaa ggt aat tca aAT Gaa
!    Start IV



!     I   V   K   C   N   .  End of I
! iv    L3  L   N5  V   I7  N   F   V10
4218 att gtt aaa tgt aat TAA T TTT GTT

4243 ttc ttg atg ttt gtt tca tca tct tct ttt gct cag gta att gaa atg
4291 aat aat tcg cct ctg cgc gat ttt gta act tgg tat tca aag caa tca
4339 ggc gaa tcc gtt att gtt tct ccc gat gta aaa ggt act gtt act gta
4387 tat tca tct gac gtt aaa cct gaa aat cta cgc aat ttc ttt att tct
4435 gtt tta cgt gcA aat aat ttt gat atg gtA ggt tcT aAC cct tcc atT
4483 att cag aag tat aat cca aac aat cag gat tat att gat gaa ttg cca
4531 tca tct gat aat cag gaa tat gat gat aat tcc gct cct tct ggt ggt
4579 ttc ttt gtt ccg caa aat gat aat gtt act caa act ttt aaa att aat
4627 aac gtt cgg gca aag gat tta ata cga gtt gtc gaa ttg ttt gta aag
4675 tct aat act tct aaa tcc tca aat gta tta tct att gac ggc tct aat
4723 cta tta gtt gtt agt gcT cct aaa gat att tta gat aac ctt cct caa
4771 ttc ctt tcA act gtt gat ttg cca act gac cag ata ttg att gag ggt
4819 ttg ata ttt gag gtt cag caa ggt gat gct tta gat ttt tca ttt gct
4867 gct ggc tct cag cgt ggc act gtt gca ggc ggt gtt aat act gac cgc
4915 ctc acc tct gtt tta tct tct gct ggt ggt tcg ttc ggt att ttt aat
4963 ggc gat gtt tta ggg cta tca gtt cgc gca tta aag act aat agc cat
5011 tca aaa ata ttg tct gtg cca cgt att ctt acg ctt tca ggt cag aag
5059 ggt tct atc tct gtT GGC CAg aat gtc cct ttt att act ggt cgt gtg
!                      MscI.
5107 act ggt gaa tct gcc aat gta aat aat cca ttt cag acg att gag cgt
5155 caa aat gta ggt att tcc atg agc gtt ttt cct gtt gca atg gct ggc
5203 ggt aat att gtt ctg gat att acc agc aag gcc gat agt ttg agt tct
5251 tct act cag gca agt gat gtt att act aat caa aga agt att gct aca
5299 acg gtt aat ttg cgt gat gga cag act ctt tta ctc ggt ggc ctc act
5347 gat tat aaa aac act tct caG gat tct ggc gta ccg ttc ctg tct aaa
5395 atc cct tta atc ggc ctc ctg ttt agc tcc cgc tct gat tcT aac gag
5443 gaa agc acg tta tac gtg ctc gtc aaa gca acc ata gta cgc gcc ctg
5491 TAG cggcgcatt
     End IV

5503 aagcgcggcg ggtgtggtgg ttacgcgcag cgtgaccgct acacttgcca gcgccctagc
5563 gcccgctcct ttcgctttct tcccttcctt tctcgccacg ttcGCCGGCt ttccccgtca
!                                                    NgoMI.
5623 agctctaaat cgggggctcc ctttagggtt ccgatttagt gctttacggc acctcgaccc
5683 caaaaaactt gatttgggtg atggttCACG TAGTGggcca tcgccctgat agacggtttt
!                                DraIII....
5743 tcgccctttG ACGTTGGAGT Ccacgttctt taatagtgga ctcttgttcc aaactggaac
!             DrdI
5803 aacactcaac cctatctcgg gctattcttt tgatttataa gggattttgc cgatttcgga
5863 accaccatca aacaggattt tcgcctgctg gggcaaacca gcgtggaccg cttgctgcaa
5923 ctctctcagg gccaggcggt gaagggcaat CAGCTGttgc cCGTCTCact ggtgaaaaga
!                                     PvuII.      BsmBI.
5983 aaaaccaccc tGGATCC AAGCTT
!                BamHI   HindIII (1/2)
     Insert carrying bla gene
6006 gcaggtg gcacttttcg gggaaatgtg cgcggaaccc
6043 ctatttgttt atttttctaa atacattcaa atatGTATCC gctcatgaga caataaccct
!                                         BciVI
6103 gataaatgct tcaataatat tgaaaaAGGA AGAgt
!                                RBS.?...
     Start bla gene
6138 ATG agt att caa cat ttc cgt gtc gcc ctt att ccc ttt ttt gcg gca ttt
6189 tgc ctt cct gtt ttt gct cac cca gaa acg ctg gtg aaa gta aaa gat gct
6240 gaa gat cag ttg ggC gcA CTA GTg ggt tac atc gaa ctg gat ctc aac agc
!                        SpeI....
!                    ApaLI & BssSI Removed
6291 ggt aag atc ctt gag agt ttt cgc ccc gaa gaa cgt ttt cca atg atg agc
6342 act ttt aaa gtt ctg cta tgt GGC GcG Gta tta tcc cgt att gac gcc ggg
6393 caa gaG CAA CTC GGT CGc cgC ATA cAC tat tct cag aat gac ttg gtt gAG
!            BcgI                                                    ScaI
6444 TAC Tca cca gtc aca gaa aag cat ctt acg gat ggc atg aca gta aga gaa
!    ScaI.
6495 tta tgc agt gct gcc ata acc atg agt gat aac act gcg gcc aac tta ctt
6546 ctg aca aCG ATC Gga gga ccg aag gag cta acc gct ttt ttg cac aac atg
!             PvuI....
6597 ggg gat cat gta act cgc ctt gat cgt tgg gaa ccg gag ctg aat gaa gcc
6648 ata cca aac gac gag cgt gac acc acg atg cct gta gca atg Gca aca acg
6699 tTG CGC Aaa cta tta act ggc gaa cta ctt act cta gct tcc cgg caa caa
!     FspI....

6750 tta ata gac tgg atg gag gcg gat aaa gtt gca gga cca ctt ctg cgc tcg
6801 GCC ctt ccG GCt ggc tgg ttt att gct gat aaa tct gga gcc ggt gag cgt
!    BglI
6852 gGG TCT Cgc ggt atc att gca gca ctg ggg cca gat ggt aag ccc tcc cgt
!     BsaI....
6903 atc gta gtt atc tac acG ACg ggg aGT Cag gca act atg gat gaa cga aat
!                          AhdI
6954 aga cag atc gct gag ata ggt gcc tca ctg att aag cat tgg TAA ctgt
!                                                            stop
7003 cagaccaagt ttactcatat atactttaga ttgatttaaa acttcatttt taatttaaaa
7063 ggatctaggt gaagatcctt tttgataatc tcatgaccaa aatcccttaa cgtgagtttt
7123 cgttccactg tacgtaagac cccc 

7147  AAGCTT GTCGAC tgaa tggcgaatgg cgctttgcct
!     HindIII SalI..
!     (2/2)   HincII

7183  ggtttccggc accagaagcg gtgccggaaa gctggctgga gtgcgatctt

Start of Fab-display cassette, the Fab DSR-A05, selected for
binding to a protein antigen.

7233  CCTGAcG CTCGAG
      xBsu36I XhoI..

PlacZ promoter is in the following block

7246  cgcaacgc aattaatgtg agttagctca
7274  ctcattaggc accccaggct ttacacttta tgcttccggc tcgtatgttg
7324  tgtggaattg tgagcggata acaatttcac acaggaaaca gctatgacca
7374  tgattacgCC AagcttTGGa gccttttttt tggagatttt caac
              PflMI.......
                Hind3. (there are 3)

Gene iii signal sequence:

        1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
        M   K   K   L   L   F   A   I   P   L   V   V   P   F   Y
7418  gtg aaa aaa tta tta ttc gca att cct tta gtt gtt cct ttc tat

       16  17  18   Start light chain (L20:JK1)
        S   H   S   A   Q   D   I   Q   M   T   Q   S   P   A
7463  tct cac aGT GCA Caa gac atc cag atg acc cag tct cca gcc
               ApaLI...
      Sequence supplied by extender

        T   L   S   L
7505  acc ctg tct ttg



        S   P   G   E   R   A   T   L   S   C   R   A   S   Q   G
7517  tct cca ggg gaa aga gcc acc ctc tcc tgc agg gcc agt cag Ggt

        V   S   S   Y   L   A   W   Y   Q   Q   K   P   G   Q   A
7562  gtt agc agc tac tta gcc tgg tac cag cag aaa cct ggc cag gct

        P   R   L   L   I   Y   D   A   S   S   R   A   T   G   I
7607  ccc agg ctc ctc atc tat gAt gca tcc aAc agg gcc act ggc atc

        P   A   R   F   S   G   S   G   P   G   T   D   F   T   L
7652  cca gCc agg ttc agt ggc agt ggg Cct ggg aca gac ttc act ctc

        T   I   S   S   L   E   P   E   D   F   A   V   Y   Y   C
7697  acc atc agc agC ctA gag cct gaa gat ttt gca gtT tat tac tgt

        Q   Q   R   S   W   H   P   W   T   F   G   Q   G   T   K
7742  cag cag CGt aAc tgg cat ccg tgg ACG TTC GGC CAA GGG ACC AAG

        V   E   I   K   R   T   V   A   A   P   S   V   F   I   F
7787  gtg gaa atc aaa cga act gtg gCT GCA Cca tct gtc ttc atc ttc
                                   BsgI....

        P   P   S   D   E   Q   L   K   S   G   T   A   S   V   V
7832  ccg cca tct gat gag cag ttg aaa tct gga act gcc tct gtt gtg

        C   L   L   N   N   F   Y   P   R   E   A   K   V   Q   W
7877  tgc ctg ctg aat aac ttc tat ccc aga gag gcc aaa gta cag tgg
        K   V   D   N   A   L   Q   S   G   N   S   Q   E   S   V
7922  aag gtg gat aac gcc ctc caa tcg ggt aac tcc cag gag agt gtc

        T   E   R   D   S   K   D   S   T   Y   S   L   S   S   T
7967  aca gag cgg gac agc aag gac agc acc tac agc ctc agc agc acc

        L   T   L   S   K   A   D   Y   E   K   H   K   V   Y   A
8012  ctg acG CTG AGC aaa gca gac tac gag aaa cac aaa gtc tac gcc
            EspI.....

        C   E   V   T   H   Q   G   L   S   S   P   V   T   K   S
8057  tgc gaa gtc acc cat cag ggc ctG AGC TCg ccc gtc aca aag agc
                                    SacI....











! F N R G E C 

8102 ttc aac agg gga gag tgt taa taa 

8126  GGCGCG CCaattctat ttcaaGGAGA cagtcata
!     AscI                   RBS2.

PelB signal sequence (22 codons) >

        1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
        M   K   Y   L   L   P   T   A   A   A   G   L   L   L   L
8160  atg aaa tac cta ttg cct acg gca gcc gct gga ttg tta tta ctc

...PelB signal > Start VH, FR1 >

       16  17  18  19  20  21  22  23  24  25  26  27  28  29  30
        A   A   Q   P   A   M   A   E   V   Q   L   L   E   S   G
8205  gcG GCC cag ccG GCC atg gcc gaa gtt CAA TTG tta gag tct ggt
        SfiI.............             MfeI...
                      NcoI....

       31  32  33  34  35  36  37  38  39  40  41  42  43  44  45
        G   G   L   V   Q   P   G   G   S   L   R   L   S   C   A
8250  ggc ggt ctt gtt cag cct ggt ggt tct tta cgt ctt tct tgc gct

. FR1 > CDR1 > FR2 >

       46  47  48  49  50  51  52  53  54  55  56  57  58  59  60
        A   S   G   F   T   F   S   T   Y   E   M   R   W   V   R
8295  gct TCC GGA ttc act ttc tct act tac gag atg cgt tgg gtt cgC
          BspEI..                                             BstXI.

FR2 > CDR2

       61  62  63  64  65  66  67  68  69  70  71  72  73  74  75
        Q   A   P   G   K   G   L   E   W   V   S   Y   I   A   P
8340  CAa gct ccT GGt aaa ggt ttg gag tgg gtt tct tat atc gct cct
      BstXI.........

























FR3 >

       76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
        S   G   G   D   T   A   Y   A   D   S   V   K   G   R   F
8385  tct ggt ggc gat act gct tat gct gac tcc gtt aaa ggt cgc ttc

       91  92  93  94  95  96  97  98  99 100 101 102 103 104 105
        T   I   S   R   D   N   S   K   N   T   L   Y   L   Q   M
8430  act atc TCT AGA gac aac tct aag aat act ctc tac ttg cag atg
              XbaI...
      Supplied by extender
8475 



106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 

NSLRAEDTAVYYCAR 
aac aaC TTA AGa act aaa aac act aca gtc tac tat tgt gcg agg 
Aflll. . . 

from extender > 



CDR3 > FR4 >

      121 122 123 124 125 126 127 128 129 130 131 132 133 134 135
        R   L   D   G   Y   I   S   Y   Y   Y   G   M   D   V   W
8520  agg ctc gat ggc tat att tcc tac tac tac ggt atg GAC GTC tgg
                                                      AatII..

      136 137 138 139 140 141 142 143 144 145
        G   Q   G   T   T   V   T   V   S   S
8565  ggc caa ggg acc acG GTC ACC gtc tca agc
                        BstEII...



CH1 of IgG1 >

        A   S   T   K   G   P   S   V   F   P   L   A   P   S   S
8595  gcc tcc acc aag ggc cca tcg gtc ttc ccc ctg gca ccc tcc tcc

        K   S   T   S   G   G   T   A   A   L   G   C   L   V   K
8640  aag agc acc tct ggg ggc aca gcg gcc ctg ggc tgc ctg gtc aag

        D   Y   F   P   E   P   V   T   V   S   W   N   S   G   A
8685  gac tac ttc ccc gaa ccg gtg acg gtg tcg tgg aac tca ggc gcc

        L   T   S   G   V   H   T   F   P   A   V   L   Q   S   S
8730  ctg acc agc ggc gtc cac acc ttc ccg gct gtc cta cag tCC TCA
                                                           Bsu36I.

        G   L   Y   S   L   S   S   V   V   T   V   P   S   S   S
8775  GGa ctc tac tcc ctc agc agc gta gtg acc gtg ccc tcc agc agc
      Bsu36I

        L   G   T   Q   T   Y   I   C   N   V   N   H   K
8820  ttg ggc acc cag acc tac atc tgc aac gtg aat cac aag

        N   T   K   V   D   K   K   V   E   P   K   S   C
8865  aac acc aag gtg gac aag aaa gtt gag ccc aaa tct tgt
                                                          NotI.



AHHHHHHGAAEQKLI 
8910 GCa cat cat cat cac cat cac ggg gcc gca gaa caa aaa ctc atc
..NotI H6 tag Myc-Tag

SEEDLNGAAqASSA
8955 tca gaa gag gat ctg aat ggg gcc gca tag GCT AGC tct gct

Myc-Tag... NheI...

Amber 

III 1 stump 

Domain 3 of III 

SGDFDYEKMANANKGA 
8997 agt ggc gac ttc gac tac gag aaa atg gct aat gcc aac aaa GGC GCC

tcctttttag acttggt !W.T. 

KasI...(2/4) 



M 



N 



N 



K 
9045 atG ACT GAG AAC GCT GAC GAG aat gct ttg caa agc gat gcc aag ggt

catctacgcag tct c t a c !W.T.

KLDSVATDYGAAI DGF
9093 aag tta gac agc gTC GCG Acc gac tat GGC GCC gcc ATG GAc ggc ttt

a c t t tct tt tcttt ttc !W.T.

NruI KasI...(3/4)

IGDVSGLANGNGATGD
9141 atc ggc gat gtc agt ggt tTG GCC Aac ggc aac gga gcc acc gga gac

t t c t tec cctttttt tttt !W.T. 

MscI (3/3) 

FAGSNSQMAQVGDGDN 
9189 ttc GCA GGT tcG AAT TCt cag atg gcC CAG GTT GGA GAT GGg gac aac 

ttct ca tactcttt !W.T. 

BspMI.. (2/2) XcmI

EcoRI...

SPLMNNFRQYLPSLPQ 
9237 agt ccg ctt atg aac aac ttt aga cag tac ctt ccg tct ctt ccg cag 

tea tta t t cct a tta t c c t a !W.T. 

S V E C R P F V F S AG K P Y E 
9285 agt gtc gag tgc cgt cca ttc gtt ttc tct gcc ggc aag cct tac gag

teg tatcttct age t t a a t a !W.T. 

FSIDCDKINLFR 
9333 ttc aGC Atc gac TGC gat aag atc aat ctt ttC CGC

t tct t t t c a a eta c t !W.T. 

BstAPI SacII... 

End Domain 3 

GVFAFLLYVATFMYVF 
9369 GGc gtt ttc gct ttc ttg cta tac gtc gct act ttc atg tac gtt ttc

tctgtcttattcct tat !W.T. 

start transmembrane segment 

STFANIL RNKES 
9417 aGC ACT TTC GCC AAT ATT TTA Cgc aac aaa gaa agc

tct gttcacg ttgg tct !W.T. 

Intracellular anchor. 



9453 tag tga tct CCT AGG 

AvrII..

9468 aag ccc gcc taa tga gcg ggc ttt ttt ttt ct ggt
| Trp terminator |

End Fab cassette 

9503 ATGCAT CCTGAGG ccgat actgtcgtcg tcccctcaaa ctggcagatg

NsiI.. Bsu36I. (3/3)
9551 cacggttacg atgcgcccat ctacaccaac gtgacctatc ccattacggt caatccgccg
9611 tttgttccca cggagaatcc gacgggttgt tactcgctca catttaatgt tgatgaaagc
9671 tggctacagg aaggccagac gcgaattatt tttgatggcg ttcctattgg ttaaaaaatg
9731 agctgattta acaaaaattt aaTgcgaatt ttaacaaaat attaacgttt acaATTTAAA 

Swa I . . . 

9791 Tatttgctta tacaatcttc ctgtttttgg ggcttttctg attatcaacc GGGGTAcat 
9850 ATG att gac atg cta gtt tta cga tta ccg ttc atc gat tct ctt gtt tgc
Start gene II

9901  tcc aga ctc tca ggc aat gac ctg ata gcc ttt gtA GAT CTc tca aaa ata
                                                   BglII...
9952  gct acc ctc tcc ggc atT aat tta tca gct aga acg gtt gaa tat cat att
10003 gat ggt gat ttg act gtc tcc ggc ctt tct cac cct ttt gaa tct tta cct
10054 aca cat tac tca ggc att gca ttt aaa ata tat gag ggt tct aaa aat ttt
10105 tat cct tgc gtt gaa ata aag gct tct ccc gca aaa gta tta cag ggt cat
10156 aat gtt ttt ggt aca acc gat tta gct tta tgc tct gag gct tta ttg ctt
10207 aat ttt gct aat tct ttg cct tgc ctg tat gat tta ttg gat gtt !

gene II continues

End of Table
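
The table above marks restriction sites in upper case next to their coordinates. As a minimal sketch of how such sites can be located, the following Python snippet searches the short stretch that starts at position 7147. The helper name `find_sites` and the three-enzyme dictionary are assumptions made for this illustration only; they are not part of the disclosure.

```python
# Illustrative sketch: locate restriction enzyme recognition sites in a DNA
# sequence and report 1-based coordinates in the style of the annotated table.
# The enzyme dictionary and helper name are assumptions for this example.

RECOGNITION_SITES = {
    "HindIII": "AAGCTT",
    "SalI": "GTCGAC",
    "XhoI": "CTCGAG",
}


def find_sites(sequence, first_position=1, sites=RECOGNITION_SITES):
    """Return {enzyme: [positions]} for every exact match of each site."""
    seq = sequence.replace(" ", "").upper()
    hits = {}
    for enzyme, site in sites.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(first_position + start)
            start = seq.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


# Stretch beginning at position 7147 of the vector sequence.
fragment = "AAGCTT GTCGAC tgaa tggcgaatgg cgctttgcct"
print(find_sites(fragment, first_position=7147))
```

Only exact, unambiguous recognition sequences are handled here; degenerate sites such as HincII (GTYRAC) would need pattern matching.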
Table 37: DNA seq of w.t. M13 gene iii

        1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
       fM   K   K   L   L   F   A   I   P   L   V   V   P   F   Y
1579  gtg aaa aaa tta tta ttc gca att cct tta gtt gtt cct ttc tat
      Signal sequence

       16  17  18  19  20  21  22  23  24  25  26  27  28  29  30
        S   H   S   A   E   T   V   E   S   C   L   A   K   P   H
1624  tct cac tcc gct gaa act gtt gaa agt tgt tta gca aaa ccc cat
      Signal sequence > Domain 1

       31  32  33  34  35  36  37  38  39  40  41  42  43  44  45
        T   E   N   S   F   T   N   V   W   K   D   D   K   T   L
1669  aca gaa aat tca ttt act aac gtc tgg aaa gac gac aaa act tta
      Domain 1

       46  47  48  49  50  51  52  53  54  55  56  57  58  59  60
        D   R   Y   A   N   Y   E   G   C   L   W   N   A   T   G
1714  gat cgt tac gct aac tat gag ggt tgt ctg tgG AAT GCt aca ggc
                                                BsmI....
      Domain 1

       61  62  63  64  65  66  67  68  69  70  71  72  73  74  75
        V   V   V   C   T   G   D   E   T   Q   C   Y   G   T   W
1759  gtt gta gtt tgt act ggt gac gaa act cag tgt tac ggt aca tgg
      Domain 1

       76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
        V   P   I   G   L   A   I   P   E   N   E   G   G   G   S
1804  gtt cct att ggg ctt gct atc cct gaa aat gag ggt ggt ggc tct
      Domain 1 > Linker 1

       91  92  93  94  95  96  97  98  99 100 101 102 103 104 105
        E   G   G   G   S   E   G   G   G   S   E   G   G   G   T
1849  gag ggt ggc ggt tct gag ggt ggc ggt tct gag ggt ggc ggt act
      Linker 1 >

      106 107 108 109 110 111 112 113 114 115 116 117 118 119 120
        K   P   P   E   Y   G   D   T   P   I   P   G   Y   T   Y
1894  aaa cct cct gag tac ggt gat aca cct att ccg ggc tat act tat
      Domain 2

      121 122 123 124 125 126 127 128 129 130 131 132 133 134 135
        I   N   P   L   D   G   T   Y   P   P   G   T   E   Q   N
1939  atc aac cct ctc gac ggc act taT CCG CCt ggt act gag caa aac
                                     EciI....
      Domain 2

      136 137 138 139 140 141 142 143 144 145 146 147 148 149 150
        P   A   N   P   N   P   S   L   E   E   S   Q   P   L   N
1984  ccc gct aat cct aat cct tct ctt GAG GAG tct cag cct ctt aat
                                      BseRI..
      Domain 2

      151 152 153 154 155 156 157 158 159 160 161 162 163 164 165
        T   F   M   F   Q   N   N   R   F   R   N   R   Q   G   A
2029  act ttc atg ttt cag aat aat agg ttc cga aat agg cag ggg gca
      Domain 2

      166 167 168 169 170 171 172 173 174 175 176 177 178 179 180
        L   T   V   Y   T   G   T   V   T   Q   G   T   D   P   V
2074  tta act gtt tat acg ggc act gtt act caa ggc act gac ccc gtt
      Domain 2

      181 182 183 184 185 186 187 188 189 190 191 192 193 194 195
        K   T   Y   Y   Q   Y   T   P   V   S   S   K   A   M   Y
2119  aaa act tat tac cag tac act cct gta tca tca aaa gcc atg tat
      Domain 2

      196 197 198 199 200 201 202 203 204 205 206 207 208 209 210
        D   A   Y   W   N   G   K   F   R   D   C   A   F   H   S
2164  gac gct tac tgg aac ggt aaa ttC AGa gaC TGc gct ttc cat tct
                                    AlwNI......
      Domain 2

      211 212 213 214 215 216 217 218 219 220 221 222 223 224 225
        G   F   N   E   D   P   F   V   C   E   Y   Q   G   Q   S
2209  ggc ttt aat gaG GAT CCa ttc gtt tgt gaa tat caa ggc caa tcg
                    BamHI...



      Domain 2

              228 229 230 231 232 233 234 235 236 237 238 239 240
                L   P   Q   P   P   V   N   A   G   G   G   S   G
              ctg cct caa cct cct gtc aat gct ggc ggc ggc tct ggt
      Domain 2 > Linker 2

              243 244 245 246 247 248 249 250 251 252 253 254 255
                S   G   G   G   S   E   G   G   G   S   E   G   G
              tct ggt ggc ggc tct gag ggt ggt ggc tct gag ggt ggc
      Linker 2

              258 259 260 261 262 263 264 265 266 267 268 269 270
                E   G   G   G   S   E   G   G   G   S   G   G   G
              gag ggt ggc ggc tct gag gga ggc ggt tcc ggt ggt ggc
      Linker 2

              273 274 275 276 277 278 279 280 281 282 283 284 285
                S   G   D   F   D   Y   E   K   M   A   N   A   N
              tcc ggt gat ttt gat tat gaa aag atg gca aac gct aat
      Linker 2 > Domain 3

              288 289 290 291 292 293 294 295 296 297 298 299 300
                A   M   T   E   N   A   D   E   N   A   L   Q   S
              gct atg acc gaa aat gcc gat gaa aac gcg cta cag tct
      Domain 3

      301 302 303 304 305 306 307 308 309 310 311 312 313 314 315
        D   A   K   G   K   L   D   S   V   A   T   D   Y   G   A
2479  gac gct aaa ggc aaa ctt gat tct gtc gct act gat tac ggt gct
      Domain 3

              318 319 320 321 322 323 324 325 326 327 328 329 330
                D   G   F   I   G   D   V   S   G   L   A   N   G
              gat ggt ttc att ggt gac gtt tcc ggc ctt gct aat ggt
      Domain 3

              333 334 335 336 337 338 339 340 341 342 343 344 345
                A   T   G   D   F   A   G   S   N   S   Q   M   A
              gct act ggt gat ttt gct ggc tct aat tcc caa atg gct
      Domain 3
      346 347 348 349 350 351 352 353 354 355 356 357 358 359 360
        Q   V   G   D   G   D   N   S   P   L   M   N   N   F   R
2614  caa gtc ggt gac ggt gat aat tca cct tta atg aat aat ttc cgt
      Domain 3

      361 362 363 364 365 366 367 368 369 370 371 372 373 374 375
        Q   Y   L   P   S   L   P   Q   S   V   E   C   R   P   F
2659  caa tat tta cct tcc ctc cct caa tcg gtt gaa tgt cgc cct ttt
      Domain 3

      376 377 378 379 380 381 382 383 384 385 386 387 388 389 390
        V   F   S   A   G   K   P   Y   E   F   S   I   D   C   D
2704  gtc ttt agc gct ggt aaa cca tat gaa ttt tct att gat tgt gac
      Domain 3

      391 392 393 394 395 396 397 398 399 400 401 402 403 404 405
        K   I   N   L   F   R   G   V   F   A   F   L   L   Y   V
2749  aaa ata aac tta ttc cgt ggt gtc ttt gcg ttt ctt tta tat gtt
      Domain 3 > Transmembrane segment

      406 407 408 409 410 411 412 413 414 415 416 417 418 419 420
        A   T   F   M   Y   V   F   S   T   F   A   N   I   L   R
2794  gcc acc ttt atg tat gta ttt tct acg ttt gct aac ata ctg cgt
      Transmembrane segment > ICA

      421 422 423 424 425
        N   K   E   S   .
2839  aat aag gag tct taa
      ICA >
! 2853

ICA = intracellular anchor

End of Table
Table 38: Whole mature III anchor M13-III
derived anchor with recoded DNA

        1   2   3
        A   A   A
1     GCG gcc gca
      NotI......

        4   5   6   7   8   9  10  11  12  13  14  15  16  17
        H   H   H   H   H   H   G   A   A   E   Q   K   L   I
10    cat cat cat cac cat cac ggg gcc gca gaa caa aaa ctc atc

       18  19  20  21  22  23  24  25  26  27  28  29
        S   E   E   D   L   N   G   A   A   .   A   S
52    tca gaa gag gat ctg aat ggg gcc gca Tag GCT AGC
                                              NheI...

       30  31  32  33  34  35  36  37  38  39
        D   I   N   D   D   R   M   A   S   T
88    GAT ATC aac gat gat cgt atg gct tct act
(ON_G37bot) [RC] 5'-c aac gat gat cgt atg gcG CAt Gct gcc gag aca g-3'
      EcoRV..
      Enterokinase cleavage site.

Start mature III (recoded) Domain 1 >

       40  41  42  43
        A   E   T   V
118  |gcC|gaG|acA|gtC|
      t a t t ! W.T.

       44  45  46  47  48  49  50  51  52  53  54  55  56  57  58
        E   S   C   L   A   K   P   H   T   E   N   S   F   T   N
130  |gaa|TCC|tgC|CTG|GCC|AaG|ccT|caC|acT|gaG|aat|AGT|ttC|aCA|Aat|
      agt tta a a c t a a tca t t c ! W.T.
                   MscI....

       59  60  61  62  63  64  65  66  67  68  69  70  71  72  73
        V   W   K   D   D   K   T   L   D   R   Y   A   N   Y   E
175  |gtg|TGG|aaG|gaT|gaT|aaG|acC|CtT|gAT|CGA|TaT|gcC|aaT|taC|gaA|
      c accatta tctctg ! W.T.
                                       BspDI...

       74  75  76  77  78  79  80  81  82  83  84  85  86  87  88
        G   C   L   W   N   A   T   G   V   V   V   C   T   G   D
220  |ggC|tgC|TtA|tgg|aat|gcC|ACC|GGC|GtC|gtT|gtC|TGC|ACG|ggC|gaT|
      ttcg ta tattttc ! W.T.
                            SgrAI.......       BsgI....

       89  90  91  92  93  94  95  96  97  98  99 100 101 102 103
        E   T   Q   C   Y   G   T   W   V   P   I   G   L   A   I
265  |gaG|acA|caA|tgC|taT|ggC|ACG|TGg|gtG|ccG|atA|gGC|TTA|GCC|atA|
      atgtcta tttgcttc ! W.T.
                            PmlI....               BlpI......

Domain 1 > Linker 1 >

      104 105 106 107 108 109 110 111 112 113 114 115 116 117 118
        P   E   N   E   G   G   G   S   E   G   G   G   S   E   G
310  |ccG|gaG|aaC|gaA|ggC|ggC|ggT|AGC|gaA|ggC|ggT|ggC|AGC|gaA|ggC|
      t a t g t t c tct g t c t tct g t ! W.T.

Linker 1 > Domain 2 >
      119 120 121 122 123 124 125 126 127 128 129 130 131 132 133
        G   G   S   E   G   G   G   T   K   P   P   E   Y   G   D
355  |ggT|GGA|TCC|gaA|ggA|ggT|ggA|acC|aaG|ccG|ccG|gaA|taT|ggC|gaC|
      cttgtcttattgctt ! W.T.
          BamHI..(2/2)

      134 135 136 137 138 139 140 141 142 143 144 145 146 147 148
        T   P   I   P   G   Y   T   Y   I   N   P   L   D   G   T
400  |acT|ccG|atA|CCT|GGT|taC|acC|taC|atT|aaT|ccG|TtA|gaT|ggA|acC|
      attgctt tcctcccct ! W.T.
                SexAI....

      149 150 151 152 153 154 155 156 157 158 159 160 161 162 163
        Y   P   P   G   T   E   Q   N   P   A   N   P   N   P   S
445  |taC|ccT|ccG|ggC|acC|gaA|caG|aaT|ccT|gcC|aaC|ccG|aaC|ccA|AGC|
      TGtttgaccttttt tct ! W.T.
                                                           HindIII...

      164 165 166 167 168 169 170 171 172 173 174 175 176 177 178
        L   E   E   S   Q   P   L   N   T   F   M   F   Q   N   N
490  |TTA|gaA|gaA|AGC|caA|ccG|TtA|aaC|acC|ttT|atg|ttC|caA|aaC|aaC|
      c t G G tct g tct t t c t g t t ! W.T.
      HindIII.

      179 180 181 182 183 184 185 186 187 188 189 190 191 192 193
        R   F   R   N   R   Q   G   A   L   T   V   Y   T   G   T
535  |CgT|ttT|AgG|aaC|CgT|caA|gGT|GCT|CtT|acC|gTG|TAC|AcT|ggA|acC|
      ag cca tag g g ata t t t g c t ! W.T.
                               HgiAI...        BsrGI...

      194 195 196 197 198 199 200 201 202 203 204 205 206 207 208
        V   T   Q   G   T   D   P   V   K   T   Y   Y   Q   Y   T
580  |gtC|acC|caG|GGT|ACC|gaT|ccT|gtC|aaG|acC|taC|taT|caA|taT|acC|
      ttactcctattcgct ! W.T.
                  KpnI...

      209 210 211 212 213 214 215 216 217 218 219 220 221 222 223
        P   V   S   S   K   A   M   Y   D   A   Y   W   N   G   K
625  |ccG|gtC|TCG|AGt|aaG|gcT|atg|taC|gaT|gcC|taT|tgg|aaT|ggC|aaG|
      t a a tca ac tctc cta ! W.T.
        BsaI....
            XhoI....

      224 225 226 227 228 229 230 231 232 233 234 235 236 237 238
        F   R   D   C   A   F   H   S   G   F   N   E   D   P   F
670  |ttT|CgT|gaT|tgT|gcC|ttT|caC|AGC|ggT|ttC|aaC|gaa|gac|CCt|ttT|
      CAaCctct tct c t t G T a c ! W.T.
                                                      DrdI

      239 240 241 242 243 244 245 246 247 248 249 250 251 252 253
        V   C   E   Y   Q   G   Q   S   S   D   L   P   Q   P   P
715  |gtC|tgC|gaG|taC|caG|ggT|caG|AGT|AGC|gaT|TtA|ccG|caG|ccA|CCG|
      t t a t a c a tcg tct c c g t a t t ! W.T.
      ....                                                 AgeI....

Domain 2 > Linker 2 >

      254 255 256 257 258 259 260 261 262 263 264 265 266 267 268
        V   N   A   G   G   G   S   G   G   G   S   G   G   G   S
760  |GTT|AAC|gcG|ggT|ggT|ggT|AGC|ggC|ggA|ggC|AGC|ggC|ggT|ggT|AGC|
      c t t c c c tct t t t tct tcc tct ! W.T.
      AgeI
      HpaI...
      HincII.

Linker 2 > Domain 3 >

      269 270 271 272 273 274 275 276 277 278 279 280 281 282 283
        E   G   G   G   S   E   G   G   G   S   G   G   G   S   G
805  |gaA|ggC|ggA|ggT|AGC|gaA|ggA|ggT|ggC|AGC|ggA|ggC|ggT|AGC|ggC|
      g t t c tct g t c t tct g t c tct t ! W.T.

Domain 3 >

      284 285 286 287 288 289 290 291 292 293 294 295 296 297 298
        S   G   D   F   D   Y   E   K   M   A   N   A   N   K   G
850  |AGT|ggC|gac|ttc|gac|tac|gag|aaa|atg|gct|aat|gcc|aac|aaa|GGC|
      tcc tttttag acttgg ! W.T.
                                                              KasI

      299 300 301 302 303 304 305 306 307 308 309 310 311 312 313
        A   M   T   E   N   A   D   E   N   A   L   Q   S   D   A
895  |GCC|atg|act|gag|aac|gct|gac|gaG|AAT|GCA|ctg|caa|agt|gat|gCC|
      t catctacgag tct c t ! W.T.
      KasI                          BsmI....                   StyI...

      314 315 316 317 318 319 320 321 322 323 324 325 326 327 328
        K   G   K   L   D   S   V   A   T   D   Y   G   A   A   I
940  |AAG|GGt|aag|tta|gac|agc|gTC|GCc|Aca|gac|tat|ggT|GCt|gcc|atc|
      a c act t tct t t t c t ! W.T.
      StyI            PflFI

      329 330 331 332 333 334 335 336 337 338 339 340 341 342 343
        D   G   F   I   G   D   V   S   G   L   A   N   G   N   G
985  |gac|ggc|ttt|atc|ggc|gat|gtc|agt|ggt|ctg|gct|aac|ggc|aac|gga|
      t t c t t c t tcc cct t t t t ! W.T.

      344 345 346 347 348 349 350 351 352 353
        A   T   G   D   F   A   G   S   N   S
1030 |gcc|acc|gga|gac|ttc|GCA|GGT|tcG|AAT|TCt|
      ttt tttct c ! W.T.
                                   BstBI...
                                      EcoRI...
                          BspMI..

      354 355 356 357 358 359 360 361 362 363
        Q   M   A   Q   V   G   D   G   D   N
1060  cag atg gcC CAG GTT GGA GAT GGg gac aac
      a tactcttt ! W.T.
                XcmI

      364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379
        S   P   L   M   N   N   F   R   Q   Y   L   P   S   L   P   Q
1090  agt ccg ctt atg aac aac ttt aga cag tac ctt ccg tct ctt ccg cag
      tca tta t t cct a tta t c c t a ! W.T.

      380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395
        S   V   E   C   R   P   F   V   F   S   A   G   K   P   Y   E
1138  agt gtc gag tgc cgt cca ttc gtt ttc tct gcc ggc aag cct tac gag
      tcg tatcttct agc t t a a t a ! W.T.

Domain 3 >

      396 397 398 399 400 401 402 403 404 405 406 407
        F   S   I   D   C   D   K   I   N   L   F   R
1186  ttc aGC Atc gac TGC gat aag atc aat ctt ttC CGC
      t tct tttcaacta t ! W.T.
           BstAPI........                     SacII...



transmembrane segment >

      408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423
        G   V   F   A   F   L   L   Y   V   A   T   F   M   Y   V   F
1222  GGc gtt ttc gct ttc ttg cta tac gtc gct act ttc atg tac gtt ttc
      t c t g t c t t a t t c c t t a t ! W.T.

      424 425 426 427 428 429 430 431 432 433 434 435
        S   T   F   A   N   I   L   R   N   K   E   S
1270  aGC ACT TTC GCC AAT ATT TTA Cgc aac aaa gaa agc
      tct g t t c a c g t t g g tct ! W.T.

Intracellular anchor.

1306  tag tga tct CCT AGG
                  AvrII..

1321  aag ccc gcc taa tga gcg ggc ttt ttt ttt ct ggt
      | Trp terminator |

End Fab cassette

End of Table
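
The recoded anchor is meant to change the DNA while leaving the encoded protein unchanged. The following Python sketch shows one way to check that for a short stretch, here the four codons A E T V at the start of mature III in both tables. The helper `translate` and the compact codon table are assumptions introduced for this illustration; the standard genetic code is assumed.

```python
# Illustrative sketch: confirm that a recoded stretch encodes the same amino
# acids as the wild-type stretch, using the standard genetic code.
# Helper and variable names are assumptions for this example.

BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string (spaces and case ignored) codon by codon."""
    seq = "".join(dna.split()).lower()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


wild_type = "gct gaa act gtt"  # w.t. gene iii, codons 19-22
recoded = "gcC gaG acA gtC"    # recoded anchor, codons 40-43

print(translate(wild_type), translate(recoded))
print(translate(wild_type) == translate(recoded))
```

The same comparison can be run row by row over the whole anchor; any row where the two translations differ would indicate either an intended mutation or a transcription error.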
Table 39: ONs to make deletions in III

! ONs for use with NheI

(ON_G29bot)  5'-c gTT gAT ATc gcT Agc cTA Tgc-3'   22
! this is the reverse complement of 5'-gca tag gct agc gat atc aac g-3'
!                                              NheI... scab

(ON_G104top) 5'-g|ata|ggc|tta|gcT|aGC|ccg|gag|aac|gaa|gg-3'   30
!                 Scab        NheI... 104 105 106 107 108

(ON_G236top) 5'-c|ttt|cac|agc|ggt|ttc|GCT|AGC|gac|cct|ttt|gtc|tgc-3'   37
!                                     NheI... 236 237 238 239 240

(ON_G236tCS) 5'-c|ttt|cac|agc|ggt|ttc|GCT|AGC|gac|cct|ttt|gtc|Agc|gag|tac|cag|ggt|c-3'   50
!                                     NheI... 236 237 238 239 240

! ONs for use with SphI  G CAT Gc

(ON_X37bot)  5'-gAc TgT cTc ggc Agc ATg cgc cAT Acg ATc ATc gTT g-3'   37
!                         N   D   D   R   M   A   H   A
! (ON_X37bot)=[RC] 5'-c aac gat gat cgt atg gcG CAt Gct gcc gag aca gtc-3'
!                                             SphI.... Scab

(ON_X104top) 5'-g|gtG|ccg|ata|ggc|ttG|CAT|GCa|ccg|gag|aac|gaa|gg-3'   36
!                 scab            SphI... 104 105 106 107 108

(ON_X236top) 5'-c|ttt|cac|agc|ggt|ttG|CaT|gCa|gac|cct|ttt|gtc|tgc-3'   37
!                                     SphI... 236 237 238 239 240

(ON_X236tCS) 5'-c|ttt|cac|agc|ggt|ttG|CaT|gCa|gac|cct|ttt|gtc|Agc|gag|tac|cag|ggt|c-3'   50
!                                     SphI... 236 237 238 239 240
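
Table 39 states that ON_G29bot is the reverse complement of a given top-strand sequence. A minimal Python sketch of that check follows; the helper name `reverse_complement` is an assumption for this illustration, and the input is the 22-base top-strand sequence given in the table.

```python
# Illustrative sketch: compute the reverse complement of an oligonucleotide,
# preserving letter case. The helper name is an assumption for this example.

COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string, ignoring spaces."""
    return "".join(seq.split()).translate(COMPLEMENT)[::-1]


top_strand = "gca tag gct agc gat atc aac g"
bottom = reverse_complement(top_strand)
print(bottom, len(bottom))
```

Lower-casing the table entry for ON_G29bot and removing its spaces gives the same 22-base string.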
Table 40: Phage titers and enrichments of
selections with a DY3F31-based human Fab library

                       input          output         Output/input
                       (total cfu)    (total cfu)    ratio

R1-ox selected on      4.5 x 10^12    3.4 x 10^5     7.5 x 10^-8
phOx-BSA

R2-Strep selected      9.2 x 10^12    3 x 10^8       3.3 x 10^-5
on Strep-beads
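
The last column of Table 40 is the output titer divided by the input titer. As a small worked example, the Python sketch below recomputes the ratio for the R2-Strep selection from the two titers in the table; the function name `enrichment_ratio` is an assumption for this illustration.

```python
# Illustrative sketch: recompute the output/input ratio of a selection round
# from the input and output titers (total cfu). Names are assumptions.

def enrichment_ratio(input_cfu, output_cfu):
    """Return output titer divided by input titer."""
    return output_cfu / input_cfu


# R2-Strep selected on Strep-beads: input 9.2 x 10^12 cfu, output 3 x 10^8 cfu.
ratio = enrichment_ratio(9.2e12, 3e8)
print(f"{ratio:.1e}")
```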
Table 41: Frequency of ELISA positives in
DY3F31-based Fab libraries

                              Anti-M13 HRP   9E10/RAM-HRP   Anti-CK/CL Gar-HRP

R2-ox (with IPTG induction)   18/44          10/44          10/44
R2-ox (without IPTG)          13/44          ND             ND
R3-strep (with IPTG)          39/44          38/44          36/44
R3-strep (without IPTG)       33/44          ND             ND
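
The frequencies in Table 41 are counts of ELISA-positive clones out of 44 tested. The Python sketch below converts the anti-M13 HRP column to percentages so that the induced and uninduced conditions can be compared at a glance; the dictionary layout and the helper name `percent_positive` are assumptions for this illustration.

```python
# Illustrative sketch: convert ELISA-positive counts (anti-M13 HRP column of
# Table 41) to percentages. Data layout and names are assumptions.

ANTI_M13_POSITIVES = {
    "R2-ox (with IPTG induction)": (18, 44),
    "R2-ox (without IPTG)": (13, 44),
    "R3-strep (with IPTG)": (39, 44),
    "R3-strep (without IPTG)": (33, 44),
}


def percent_positive(positives, tested):
    """Return the percentage of positive clones, rounded to one decimal."""
    return round(100.0 * positives / tested, 1)


percentages = {
    label: percent_positive(pos, total)
    for label, (pos, total) in ANTI_M13_POSITIVES.items()
}
for label, value in percentages.items():
    print(f"{label}: {value}%")
```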
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We claim: 

1. A method for cleaving single-stranded
nucleic acid sequences at a desired location, the
method comprising the steps of:

(i) contacting the nucleic acid with a
single-stranded oligonucleotide, the
oligonucleotide being functionally
complementary to the nucleic acid in the
region in which cleavage is desired and
including a sequence that with its complement
in the nucleic acid forms a restriction
endonuclease recognition site that on
restriction results in cleavage of the
nucleic acid at the desired location; and

(ii) cleaving the nucleic acid solely at
the recognition site formed by the
complementation of the nucleic acid and the
oligonucleotide;

the contacting and the cleaving steps being performed
at a temperature sufficient to maintain the nucleic
acid in substantially single-stranded form, the
oligonucleotide being functionally complementary to the
nucleic acid over a large enough region to allow the
two strands to associate such that cleavage may occur
at the chosen temperature and at the desired location,
and the cleavage being carried out using a restriction
endonuclease that is active at the chosen temperature.
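
The logic of steps (i) and (ii) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: annealing is modelled as exact complementarity, temperature and enzyme activity are not modelled, and the target sequence, oligonucleotide, and function name `locate_cleavage_site` are hypothetical examples rather than sequences from the disclosure.

```python
# Illustrative sketch of the claimed logic: an oligonucleotide anneals to a
# single-stranded target, and the resulting local duplex must contain the
# restriction endonuclease recognition site. All sequences are hypothetical.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-cased DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def locate_cleavage_site(target, oligo, site):
    """Return the 0-based index in `target` where `site` lies inside the
    duplex formed with `oligo`, or None if the oligo does not anneal
    (exact match assumed) or the duplex lacks the recognition site."""
    target = target.upper()
    footprint = reverse_complement(oligo)  # target region paired with oligo
    start = target.find(footprint)
    if start == -1:
        return None
    offset = footprint.find(site.upper())
    if offset == -1:
        return None
    return start + offset


target = "ATGACCGTTGCTAGCGATATCAACGGTCTT"  # hypothetical single strand
oligo = "GATATCGCTAGCAACG"                 # hypothetical complementary oligo
print(locate_cleavage_site(target, oligo, "GCTAGC"))
```

Because the recognition site becomes double-stranded only where the oligonucleotide anneals, other occurrences of the same sequence elsewhere in the single strand would remain uncut, which is what restricts cleavage to the desired location.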






2. A method for cleaving single-stranded
nucleic acid sequences at a desired location, the
method comprising the steps of:

(i) contacting the nucleic acid with a
partially double-stranded oligonucleotide,
the single-stranded region of the
oligonucleotide being functionally
complementary to the nucleic acid in the
region in which cleavage is desired, and the
double-stranded region of the oligonucleotide
having a restriction endonuclease recognition
site; and

(ii) cleaving the nucleic acid solely at
the restriction endonuclease recognition site
formed by the complementation of the nucleic
acid and the single-stranded region of the
oligonucleotide;

the contacting and the cleaving steps being performed
at a temperature sufficient to maintain the nucleic
acid in substantially single-stranded form, the
oligonucleotide being functionally complementary to the
nucleic acid over a large enough region to allow the
two strands to associate such that cleavage may occur
at the chosen temperature and at the desired location,
and the cleavage being carried out using a restriction
endonuclease that is active at the chosen temperature.

3. In a method for displaying a member of a
diverse family of peptides, polypeptides or proteins on
the surface of a genetic package and collectively
displaying at least a part of the diversity of the
family, the improvement being characterized in that the
displayed peptide, polypeptide or protein is encoded at
least in part by a nucleic acid that has been cleaved
at a desired location by a method comprising the steps
of:

(i) contacting the nucleic acid with a
single-stranded oligonucleotide, the
oligonucleotide being functionally
complementary to the nucleic acid in the
region in which cleavage is desired and
including a sequence that with its complement
in the nucleic acid forms a restriction
endonuclease recognition site that on
restriction results in cleavage of the
nucleic acid at the desired location; and

(ii) cleaving the nucleic acid solely at
the recognition site formed by the
complementation of the nucleic acid and the
oligonucleotide;

the contacting and the cleaving steps being performed
at a temperature sufficient to maintain the nucleic
acid in substantially single-stranded form, the
oligonucleotide being functionally complementary to the
nucleic acid over a large enough region to allow the
two strands to associate such that cleavage may occur
at the chosen temperature and at the desired location,
and the cleavage being carried out using a restriction
endonuclease that is active at the chosen temperature.

4. In a method for displaying a member of a
diverse family of peptides, polypeptides or proteins on
the surface of a genetic package and collectively
displaying at least a part of the diversity of the
family, the improvement being characterized in that the
displayed peptide, polypeptide or protein is encoded by



WO 02/083872 



PCT/US02/12405 




a DNA sequence comprising a nucleic acid that has been
cleaved at a desired location by

(i) contacting the nucleic acid with a
partially double-stranded oligonucleotide,
the single-stranded region of the
oligonucleotide being functionally
complementary to the nucleic acid in the
region in which cleavage is desired, and the
double-stranded region of the oligonucleotide
having a restriction endonuclease recognition
site; and

(ii) cleaving the nucleic acid solely at
the restriction endonuclease recognition
cleavage site formed by the complementation
of the nucleic acid and the single-stranded
region of the oligonucleotide;

the contacting and the cleaving steps being performed
at a temperature sufficient to maintain the nucleic
acid in substantially single-stranded form, the
oligonucleotide being functionally complementary to the
nucleic acid over a large enough region to allow the
two strands to associate such that cleavage may occur
at the chosen temperature and at the desired location,
and the cleavage being carried out using a restriction
endonuclease that is active at the chosen temperature.

5. A method for displaying a member of a
diverse family of peptides, polypeptides or proteins on
the surface of a genetic package and collectively
displaying at least a part of the diversity of the
family, the method comprising the steps of:

(i) preparing a collection of nucleic acids
that code at least in part for members of the diverse
family;

(ii) rendering the nucleic acids single-stranded;

(iii) cleaving the single-stranded nucleic
acids at a desired location by a method comprising the
steps of:

(a) contacting the nucleic acid with a
single-stranded oligonucleotide, the
oligonucleotide being functionally
complementary to the nucleic acid in the
region in which cleavage is desired and
including a sequence that with its complement
in the nucleic acid forms a restriction
endonuclease recognition site that on
restriction results in cleavage of the
nucleic acid at the desired location; and

(b) cleaving the nucleic acid solely at
the recognition site formed by the
complementation of the nucleic acid and the
oligonucleotide;

the contacting and the cleaving steps being
performed at a temperature sufficient to maintain
the nucleic acid in substantially single-stranded
form, the oligonucleotide being functionally
complementary to the nucleic acid over a large
enough region to allow the two strands to
associate such that cleavage may occur at the
chosen temperature and at the desired location,
and the cleavage being carried out using a
restriction endonuclease that is active at the
chosen temperature; and

(iv) displaying a member of the family of
peptides, polypeptides or proteins coded, at least in
part, by the cleaved nucleic acids on the surface of
the genetic package and collectively displaying at
least a portion of the diversity of the family.



6. A method for displaying a member of a
diverse family of peptides, polypeptides or proteins on
the surface of a genetic package and collectively
displaying at least a portion of the diversity of the
family, the method comprising the steps of:

(i) preparing a collection of nucleic acids
that code, at least in part, for members of the diverse
family;

(ii) rendering the nucleic acids single-stranded;

(iii) cleaving the single-stranded nucleic
acids at a desired location by a method comprising the
steps of:

(a) contacting the nucleic acid with a
partially double-stranded oligonucleotide,
the single-stranded region of the
oligonucleotide being functionally
complementary to the nucleic acid in the
region in which cleavage is desired, and the
double-stranded region of the oligonucleotide
having a restriction endonuclease recognition
site; and

(b) cleaving the nucleic acid solely at
the restriction endonuclease recognition
cleavage site formed by the complementation
of the nucleic acid and the single-stranded
region of the oligonucleotide;

the contacting and the cleaving steps being
performed at a temperature sufficient to maintain
the nucleic acid in substantially single-stranded
form, the oligonucleotide being functionally
complementary to the nucleic acid over a large
enough region to allow the two strands to
associate such that cleavage may occur at the
chosen temperature and at the desired location,
and the restriction being carried out using a
cleavage endonuclease that is active at the chosen
temperature; and

(iv) displaying a member of the family of
peptides, polypeptides or proteins coded, at least in
part, by the cleaved nucleic acids on the surface of
the genetic package and collectively displaying at
least a portion of the diversity of the family.

7. In a method for expressing a member of a 
diverse family of peptides, polypeptides or proteins 
and collectively expressing at least a part of the 
diversity of the family, the improvement being
characterized in that the expressed peptide,
polypeptide or protein is encoded at least in part by a
nucleic acid that has been cleaved at a desired
location by a method comprising the steps of:

(i) contacting the nucleic acid with a
single-stranded oligonucleotide, the
oligonucleotide being functionally
complementary to the nucleic acid in the
region in which cleavage is desired and
including a sequence that with its complement
in the nucleic acid forms a restriction
endonuclease recognition site that on
restriction results in cleavage of the
nucleic acid at the desired location; and

(ii) cleaving the nucleic acid solely at
the recognition site formed by the
complementation of the nucleic acid and the
oligonucleotide;

the contacting and the cleaving steps being performed
at a temperature sufficient to maintain the nucleic
acid in substantially single-stranded form, the
oligonucleotide being functionally complementary to the
nucleic acid over a large enough region to allow the
two strands to associate such that cleavage may occur
at the chosen temperature and at the desired location,
and the cleavage being carried out using a restriction
endonuclease that is active at the chosen temperature.
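The oligonucleotide-directed cleavage recited above can be illustrated with a minimal sketch. It rests on several assumptions that are not part of the claim: exact Watson-Crick pairing stands in for "functionally complementary", the recognition site is palindromic, and the sequences `FOOTPRINT` and `TARGET` and the helper `find_directed_cut` are hypothetical. AluI (AGCT, blunt cut AG^CT) is used only as a familiar example of a site that is cut solely where the annealed oligonucleotide makes the nucleic acid double-stranded.

```python
from typing import Optional

# Illustrative model only: exact pairing stands in for the
# "functionally complementary" annealing recited in the claim.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_directed_cut(target: str, oligo: str, site: str,
                      cut_offset: int) -> Optional[int]:
    """Return the index in `target` at which the strand is cut, or None.

    Only the stretch of `target` that pairs with `oligo` is treated as
    double-stranded, so occurrences of `site` elsewhere are ignored.
    """
    if site != reverse_complement(site):
        raise ValueError("this sketch handles palindromic sites only")
    footprint = reverse_complement(oligo)   # region the oligo anneals to
    start = target.find(footprint)
    if start == -1:
        return None                         # oligo does not anneal
    site_pos = footprint.find(site)
    if site_pos == -1:
        return None                         # duplex holds no site
    return start + site_pos + cut_offset


# Hypothetical sequences: AGCT occurs three times in TARGET, but only
# the copy inside the annealed footprint is cleaved.
FOOTPRINT = "GACCGGAAGCTCCGTACG"
TARGET = "AGCTTT" + FOOTPRINT + "TTAGCT"
OLIGO = reverse_complement(FOOTPRINT)
cut = find_directed_cut(TARGET, OLIGO, site="AGCT", cut_offset=2)
```

In this toy model the sites outside the footprint stay single-stranded and are therefore not reported, which mirrors the "solely at the recognition site formed by the complementation" limitation.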



8. In a method for expressing a member of a 
diverse family of peptides, polypeptides or proteins 
and collectively expressing at least a part of the 
diversity of the family, the improvement being 
characterized in that the expressed peptide,
polypeptide or protein is encoded by a DNA sequence
comprising a nucleic acid that has been cleaved at a
desired location by

(i) contacting the nucleic acid with a
partially double-stranded oligonucleotide,
the single-stranded region of the
oligonucleotide being functionally
complementary to the nucleic acid in the
region in which cleavage is desired, and the
double-stranded region of the oligonucleotide
having a restriction endonuclease recognition
site; and

(ii) cleaving the nucleic acid solely at
the restriction endonuclease recognition
cleavage site formed by the complementation
of the nucleic acid and the single-stranded
region of the oligonucleotide;

the contacting and the cleaving steps being performed
at a temperature sufficient to maintain the nucleic
acid in substantially single-stranded form, the
oligonucleotide being functionally complementary to the
nucleic acid over a large enough region to allow the
two strands to associate such that cleavage may occur
at the chosen temperature and at the desired location,
and the cleavage being carried out using a restriction
endonuclease that is active at the chosen temperature.

9. A method for expressing a member of a 
diverse family of peptides, polypeptides or proteins 
and collectively expressing at least a part of the 
diversity of the family, the method comprising the
steps of:

(i) preparing a collection of nucleic acids
that code at least in part for members of the diverse
family;

(ii) rendering the nucleic acids single-stranded;

(iii) cleaving the single-stranded nucleic
acids at a desired location by a method comprising the
steps of:

(a) contacting the nucleic acid with a
single-stranded oligonucleotide, the
oligonucleotide being functionally
complementary to the nucleic acid in the
region in which cleavage is desired and
including a sequence that with its complement
in the nucleic acid forms a restriction
endonuclease recognition site that on
restriction results in cleavage of the
nucleic acid at the desired location; and

(b) cleaving the nucleic acid solely at
the recognition site formed by the
complementation of the nucleic acid and the
oligonucleotide;

the contacting and the cleaving steps being
performed at a temperature sufficient to maintain
the nucleic acid in substantially single-stranded
form, the oligonucleotide being functionally
complementary to the nucleic acid over a large
enough region to allow the two strands to
associate such that cleavage may occur at the
chosen temperature and at the desired location,
and the cleavage being carried out using a
restriction endonuclease that is active at the
chosen temperature; and

(iv) expressing a member of the family of
peptides, polypeptides or proteins coded, at least in
part, by the cleaved nucleic acids and collectively
expressing at least a portion of the diversity of the
family.

10. A method for expressing a member of a
diverse family of peptides, polypeptides or proteins
and collectively expressing at least a portion of the
diversity of the family, the method comprising the
steps of:

(i) preparing a collection of nucleic acids
that code, at least in part, for members of the diverse
family;

(ii) rendering the nucleic acids single-stranded;

(iii) cleaving the single-stranded nucleic
acids at a desired location by a method comprising the
steps of:

(a) contacting the nucleic acid with a
partially double-stranded oligonucleotide,
the single-stranded region of the
oligonucleotide being functionally
complementary to the nucleic acid in the
region in which cleavage is desired, and the
double-stranded region of the oligonucleotide
having a restriction endonuclease recognition
site; and

(b) cleaving the nucleic acid solely at
the restriction endonuclease recognition
cleavage site formed by the complementation
of the nucleic acid and the single-stranded
region of the oligonucleotide;

the contacting and the cleaving steps being
performed at a temperature sufficient to maintain
the nucleic acid in substantially single-stranded
form, the oligonucleotide being functionally
complementary to the nucleic acid over a large
enough region to allow the two strands to
associate such that cleavage may occur at the
chosen temperature and at the desired location,
and the restriction being carried out using a
cleavage endonuclease that is active at the chosen
temperature; and

(iv) expressing a member of the family of
peptides, polypeptides or proteins coded, at least in
part, by the cleaved nucleic acids and collectively
expressing at least a portion of the diversity of the
family.



11. A library comprising a collection of
genetic packages that display a member of a diverse
family of peptides, polypeptides or proteins and
collectively display at least a portion of the
diversity of the family, the library being produced
using the methods of claims 3, 4, 5 or 6.

12. A library comprising a collection of
genetic packages that display a member of a diverse
family of peptides, polypeptides or proteins and that
collectively display at least a portion of the family,
the displayed peptides, polypeptides or proteins being
encoded by DNA sequences comprising at least in part
sequences produced by cleaving single-stranded nucleic
acid sequences at a desired location by a method
comprising the steps of:

(i) contacting the nucleic acid with a
single-stranded oligonucleotide, the
oligonucleotide being functionally
complementary to the nucleic acid in the
region in which cleavage is desired and
including a sequence that with its complement
in the nucleic acid forms a restriction
endonuclease recognition site that on
restriction results in cleavage of the
nucleic acid at the desired location; and

(ii) cleaving the nucleic acid solely at
the recognition site formed by the
complementation of the nucleic acid and the
oligonucleotide;

the contacting and the cleaving steps being performed
at a temperature sufficient to maintain the nucleic
acid in substantially single-stranded form, the
oligonucleotide being functionally complementary to the
nucleic acid over a large enough region to allow the
two strands to associate such that cleavage may occur
at the chosen temperature and at the desired location,
and the cleavage being carried out using a restriction
endonuclease that is active at the chosen temperature.

13. A library comprising a collection of 
genetic packages that display a member of a diverse 
family of peptides, polypeptides or proteins and that 
collectively display at least a portion of the 
diversity of the family, the displayed peptides,
polypeptides or proteins being encoded by DNA sequences
comprising at least in part sequences produced by
cleaving single-stranded nucleic acid sequences at a
desired location by a method comprising the steps of:

(i) contacting the nucleic acid with a
partially double-stranded oligonucleotide,
the single-stranded region of the
oligonucleotide being functionally
complementary to the nucleic acid in the
region in which cleavage is desired, and the
double-stranded region of the oligonucleotide
having a restriction endonuclease recognition
site; and

(ii) cleaving the nucleic acid solely at
the restriction endonuclease recognition
cleavage site formed by the complementation
of the nucleic acid and the single-stranded
region of the oligonucleotide;

the contacting and the cleaving steps being performed
at a temperature sufficient to maintain the nucleic
acid in substantially single-stranded form, the
oligonucleotide being functionally complementary to the
nucleic acid over a large enough region to allow the
two strands to associate such that cleavage may occur
at the chosen temperature and at the desired location,
and the cleavage being carried out using a restriction
endonuclease that is active at the chosen temperature.

14. A library comprising a collection of 
members of a diverse family of peptides, polypeptides 
or proteins and collectively comprising at least a 

portion of the diversity of the family, the library
being produced using the methods of claims 7, 8, 9 or 
10. 

15. A library comprising a collection of 
members of a diverse family of peptides, polypeptides 

or proteins and collectively comprising at least a
portion of the diversity of the family, the peptides,
polypeptides or proteins being encoded by DNA sequences
comprising at least in part sequences produced by
cleaving single-stranded nucleic acid sequences at a
desired location by a method comprising the steps of:

(i) contacting the nucleic acid with a
single-stranded oligonucleotide, the
oligonucleotide being functionally
complementary to the nucleic acid in the
region in which cleavage is desired and
including a sequence that with its complement
in the nucleic acid forms a restriction
endonuclease recognition site that on
restriction results in cleavage of the
nucleic acid at the desired location; and

(ii) cleaving the nucleic acid solely at
the recognition site formed by the
complementation of the nucleic acid and the
oligonucleotide;

the contacting and the cleaving steps being performed
at a temperature sufficient to maintain the nucleic
acid in substantially single-stranded form, the
oligonucleotide being functionally complementary to the
nucleic acid over a large enough region to allow the
two strands to associate such that cleavage may occur
at the chosen temperature and at the desired location,
and the cleavage being carried out using a restriction
endonuclease that is active at the chosen temperature.

16. A library comprising a collection of 
members of a diverse family of peptides, polypeptides
or proteins and collectively comprising at least a
portion of the diversity of the family, the peptides,
polypeptides or proteins being encoded by DNA sequences
comprising at least in part sequences produced by
cleaving single-stranded nucleic acid sequences at a
desired location by a method comprising the steps of:

(i) contacting the nucleic acid with a
partially double-stranded oligonucleotide,
the single-stranded region of the
oligonucleotide being functionally
complementary to the nucleic acid in the
region in which cleavage is desired, and the
double-stranded region of the oligonucleotide
having a restriction endonuclease recognition
site; and

(ii) cleaving the nucleic acid solely at
the restriction endonuclease recognition
cleavage site formed by the complementation
of the nucleic acid and the single-stranded
region of the oligonucleotide;

the contacting and the cleaving steps being performed
at a temperature sufficient to maintain the nucleic
acid in substantially single-stranded form, the
oligonucleotide being functionally complementary to the
nucleic acid over a large enough region to allow the
two strands to associate such that cleavage may occur
at the chosen temperature and at the desired location,
and the cleavage being carried out using a restriction
endonuclease that is active at the chosen temperature.

17. A library of claims 11, 12 or 13 wherein
the genetic packages are selected from the group of
phage, phagemid or yeast.

18. A library of claim 17 wherein the
genetic packages are phage or phagemid.



19. The methods or libraries according to
claims 2, 4, 6, 8, 10, 13 or 16 wherein the
restriction endonuclease recognition site is for a
Type II-S restriction endonuclease.

20. The methods or libraries according to 
claims 1 to 19, wherein the nucleic acid is cDNA. 

21. The methods or libraries according to
any one of claims 1 to 20, wherein the nucleic acids
encode at least a portion of an immunoglobulin. 

22. The methods or libraries according to 
claim 21, wherein the immunoglobulin comprises a Fab or 

single chain Fv.

23. The methods or libraries according to 
claim 21 or 22, wherein the immunoglobulin comprises at
least a portion of a heavy chain.

24. The methods or libraries according to
claim 23, wherein the heavy chain is IgM, IgG, IgA, IgE
or IgD. 

25. The methods or libraries according to 
claim 23 or 24, wherein at least a portion of the heavy
chain is human. 

26. The methods or libraries according to
claim 21 or 22, wherein the immunoglobulin comprises at
least a portion of FR1. 

27. The methods or libraries according to 
claim 26, wherein at least a portion of the FR1 is 
human.

28. The methods or libraries according to
claim 21 or 22, wherein the immunoglobulin comprises at
least a portion of a light chain. 

29. The methods or libraries according to
claim 28, wherein at least a portion of the light chain
is human. 



30. The methods or libraries according to 
any one of claims 1 to 16, wherein the nucleic acid 

sequences are at least in part derived from patients
suffering from at least one autoimmune disease and/or 
cancer. 

31. The methods or libraries according to 
claim 30, wherein the autoimmune disease is selected 

from the group comprising lupus erythematosus,
systemic sclerosis, rheumatoid arthritis,
antiphospholipid syndrome or vasculitis.



32. The methods or libraries according to 
claim 30, wherein the nucleic acids are at least in 

part isolated from the group comprising peripheral
blood cells, bone marrow cells, spleen cells or lymph
node cells. 

33. The methods according to claim 5, 6, 9 
or 10 further comprising at least one nucleic acid 

amplification step between one or more of steps (i) and
(ii), steps (ii) and (iii) or between steps (iii) and
(iv).

34. The method according to claim 33,
wherein amplification primers for the amplification
step are functionally complementary to a constant
region of the nucleic acids.

35. The method according to claim 34,
wherein the constant region is genetically constant in
the nucleic acids.



36. The method according to claim 35, 
wherein the genetically constant region is a part of 
the genome of immunoglobulin genes selected from the 
group of IgM, IgG, IgA, IgE or IgD. 






37. The method according to claim 34, 
wherein the constant region is exogenous to the nucleic 
acids. 

38. The methods according to claim 33,

wherein the amplification step uses geneRACE™. 

39. The methods or libraries according to 
any one of claims 1 to 16, wherein the chosen 
temperature is between 37°C and 75°C.

40. The methods or libraries according to

claim 39, wherein the chosen temperature is between 
45°C and 75°C. 



41. The methods or libraries according to 
claim 40, wherein the chosen temperature is between 
50°C and 60°C.

42. The methods or libraries according to
claim 41, wherein the chosen temperature is between 
55°C and 60°C. 

43. The methods or libraries according to 
claim 1, 3, 5, 7, 9, 12 or 15, wherein the length of

the single-stranded oligonucleotide is between 17 and 
30 bases. 

44. The methods or libraries according to 
claim 43, wherein the length of the single-stranded 

oligonucleotide is between 18 and 24 bases.

45. The methods or libraries according to 
claim 1, 3, 5, 7, 9, 12 or 15, wherein the restriction 
endonuclease is selected from the group comprising 
MaeIII, Tsp45I, HphI, BsaJI, AluI, BlpI, DdeI, BglII,
MslI, BsiEI, EaeI, EagI, HaeIII, Bst4CI, HpyCH4III,
HinfI, MlyI, PleI, MnlI, HpyCH4V, BsmAI, BpmI, XmnI, or
SacI.

46. The methods or libraries according to
claim 45, wherein the restriction endonuclease is
selected from the group comprising Bst4CI, TaaI,
HpyCH4III, BlpI, HpyCH4V or MslI.

47. The methods or libraries according to 
claim 2, 4, 6, 8, 10, 13 or 16, wherein the length of 
the single-stranded region of the partially double-stranded
oligonucleotide is between 14 and 22 bases.

48. The methods or libraries according to 
claim 47, wherein the length of the single-stranded
region of the partially double-stranded oligonucleotide
is between 14 and 17 bases.

49. The methods or libraries according to 
claim 47, wherein the length of the single-stranded
region of the oligonucleotide is between 18 and 20
bases. 

50. The methods or libraries according to 
claim 2, 4, 6, 8, 10, 13 or 16, wherein the length of 
the double-stranded region of the partially double-stranded
oligonucleotide is between 10 and 14 base
pairs formed by a stem and its palindrome. 

51. The methods or libraries according to 
claim 50, wherein the partially double-stranded
oligonucleotide comprises a loop of 3 to 8 bases
between the stem and the palindrome.
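The length limits recited for the partially double-stranded oligonucleotide (single-stranded region of 14 to 22 bases, stem of 10 to 14 base pairs, loop of 3 to 8 bases) can be checked with a short sketch. The class `HairpinAdapter`, the 5'-to-3' segment order (stem, loop, stem complement, single-stranded arm) and the example sequences are all assumptions made for illustration; the claims do not fix that layout. A FokI recognition sequence (GGATG) is placed in the stem only as an example of a Type II-S site.

```python
from dataclasses import dataclass
from typing import List

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


@dataclass
class HairpinAdapter:
    """Hypothetical stem-loop oligonucleotide with a single-stranded arm."""
    stem: str   # one strand of the double-stranded region
    loop: str   # unpaired bases joining the stem to its palindrome
    arm: str    # single-stranded region that anneals to the target

    def sequence(self) -> str:
        # Assumed layout: stem, loop, stem complement, then the arm.
        return self.stem + self.loop + reverse_complement(self.stem) + self.arm

    def check(self) -> List[str]:
        """List every recited length limit that the design violates."""
        problems = []
        if not 10 <= len(self.stem) <= 14:
            problems.append("stem must be 10 to 14 base pairs")
        if not 3 <= len(self.loop) <= 8:
            problems.append("loop must be 3 to 8 bases")
        if not 14 <= len(self.arm) <= 22:
            problems.append("single-stranded region must be 14 to 22 bases")
        return problems


# Invented example: 12 bp stem carrying GGATG, 4-base loop, 16-base arm.
adapter = HairpinAdapter(stem="GCAGGATGCTAC", loop="TTTT",
                         arm="ACGTTGCAACGTTGCA")
issues = adapter.check()
```

An empty `issues` list means the design falls inside all three recited ranges; it says nothing about whether the adapter would anneal or be cleaved in practice.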

52. The methods or libraries according to 
claim 19 wherein the Type II-S restriction endonuclease 
is selected from the group comprising AarICAC, AceIII,
Bbr7I, BbvI, BbvII, Bce83I, BceAI, BcefI, BciVI, BfiI,
BinI, BscAI, BseRI, BsmFI, BspMI, EciI, Eco57I, FauI,
FokI, GsuI, HgaI, HphI, MboII, MlyI, MmeI, MnlI, PleI,
RleAI, SfaNI, SspD5I, Sth132I, StsI, TaqII, Tth111II,
or UbaPI.

53. The methods or libraries according to 
claim 52, wherein the Type II-S restriction
endonuclease is FokI.

54. A method for preparing single-stranded
nucleic acids, the method comprising the steps of:

(i) contacting a single-stranded nucleic
acid sequence that has been cleaved with a
restriction endonuclease with a partially
double-stranded oligonucleotide, the single-stranded
region of the oligonucleotide being
functionally complementary to the nucleic
acids in the region that remains after
cleavage, the double-stranded region of the
oligonucleotide including any sequences
necessary to return the sequences that remain
after cleavage into proper and original
reading frame for expression and containing a
restriction endonuclease recognition site 5'
of those sequences; and

(ii) cleaving the partially double-stranded
oligonucleotide sequence solely at
the restriction endonuclease recognition site
contained within the double-stranded region
of the partially double-stranded
oligonucleotide;

the contacting and the cleaving steps being performed
at a temperature sufficient to maintain the nucleic
acid in substantially single-stranded form, the
oligonucleotide being functionally complementary to the
nucleic acid over a large enough region to allow the
two strands to associate such that cleavage may occur
at the chosen temperature and at the desired location,
and the cleavage being carried out using a restriction
endonuclease that is active at the chosen temperature.

55. The method according to claim 54,
wherein the length of the single-stranded portion of
the partially double-stranded oligonucleotide is 
between 2 and 15 bases. 



56. The method according to claim 55,

wherein the length of the single-stranded portion of 
the partially double-stranded oligonucleotide is 
between 7 and 10 bases. 

57. The method according to claim 54, 
wherein the length of the double-stranded portion of
the partially double-stranded oligonucleotide is 
between 12 and 100 base pairs. 



58. The method according to claim 57, 
wherein the length of the double-stranded portion of 
the partially double-stranded oligonucleotide is
between 20 and 100 base pairs. 



59. A method for preparing a library 
comprising a collection of genetic packages that 
display a member of a diverse family of peptides, 
polypeptides or proteins and that collectively display
at least a portion of the family comprising the steps:

(i) preparing a collection of nucleic acids
that code at least in part for members of the diverse
family;

(ii) rendering the nucleic acids single-stranded;

(iii) cleaving the single-stranded nucleic
acids at a desired location by a method comprising the
steps of:

(a) contacting the nucleic acid with a
single-stranded oligonucleotide, the
oligonucleotide being functionally
complementary to the nucleic acid in the
region in which cleavage is desired and
including a sequence that with its complement
in the nucleic acid forms a restriction
endonuclease recognition site that on
restriction results in cleavage of the
nucleic acid at the desired location; and

(b) cleaving the nucleic acid solely at
the recognition site formed by the
complementation of the nucleic acid and the
oligonucleotide;

the contacting and the cleaving steps being
performed at a temperature sufficient to maintain
the nucleic acid in substantially single-stranded
form, the oligonucleotide being functionally
complementary to the nucleic acid over a large
enough region to allow the two strands to
associate such that cleavage may occur at the
chosen temperature and at the desired location,
and the cleavage being carried out using a
restriction endonuclease that is active at the
chosen temperature;

(iv) contacting the nucleic acid with a
partially double-stranded oligonucleotide, the
single-stranded region of the oligonucleotide being
functionally complementary to the nucleic acids in the
region that remains after the cleavage in step (iii)
has been effected, and the double-stranded region of
the oligonucleotide including any sequences necessary
to return the sequences that remain after cleavage into
proper and original reading frame for display and
containing a restriction endonuclease recognition site
5' of those sequences that is different from the
restriction site used in step (iii); and

(v) cleaving the nucleic acid solely at the
restriction endonuclease recognition cleavage site
contained within the double-stranded region of the
partially double-stranded oligonucleotide;

the contacting and the cleaving steps being
performed at a temperature sufficient to maintain
the nucleic acid in substantially single-stranded
form, the oligonucleotide being functionally
complementary to the nucleic acid over a large
enough region to allow the two strands to
associate such that cleavage may occur at the
chosen temperature and at the desired location,
and the restriction being carried out using a
cleavage endonuclease that is active at the chosen
temperature; and

(vi) displaying a member of the family of
peptides, polypeptides or proteins coded, at least in
part, by the cleaved nucleic acids on the surface of
the genetic package and collectively displaying at
least a portion of the diversity of the family.

60. A method for preparing a library
comprising a collection of members of a diverse family
of peptides, polypeptides or proteins and collectively
comprising at least a portion of the family comprising
the steps:

(i) preparing a collection of nucleic acids
that code at least in part for members of the diverse
family;

(ii) rendering the nucleic acids single-stranded;

(iii) cleaving the single-stranded nucleic
acids at a desired location by a method comprising the
steps of:

(a) contacting the nucleic acid with a
single-stranded oligonucleotide, the
oligonucleotide being functionally
complementary to the nucleic acid in the
region in which cleavage is desired and
including a sequence that with its complement
in the nucleic acid forms a restriction
endonuclease recognition site that on
restriction results in cleavage of the
nucleic acid at the desired location; and

(b) cleaving the nucleic acid solely at
the recognition site formed by the
complementation of the nucleic acid and the
oligonucleotide;

the contacting and the cleaving steps being
performed at a temperature sufficient to maintain
the nucleic acid in substantially single-stranded
form, the oligonucleotide being functionally
complementary to the nucleic acid over a large
enough region to allow the two strands to
associate such that cleavage may occur at the
chosen temperature and at the desired location,
and the cleavage being carried out using a
restriction endonuclease that is active at the
chosen temperature;

(iv) contacting the nucleic acid with a
partially double-stranded oligonucleotide, the
single-stranded region of the oligonucleotide being
functionally complementary to the nucleic acids in the
region that remains after the cleavage in step (iii)
has been effected, and the double-stranded region of
the oligonucleotide including any sequence necessary to
return the sequences that remain after cleavage into
proper and original reading frame for expression and
containing a restriction endonuclease recognition site
5' of those sequences that is different from the
restriction site used in step (iii); and

(v) cleaving the nucleic acid solely at the
restriction endonuclease recognition cleavage site
contained within the double-stranded region of the
partially double-stranded oligonucleotide;

the contacting and the cleaving steps being
performed at a temperature sufficient to maintain
the nucleic acid in substantially single-stranded
form, the oligonucleotide being functionally
complementary to the nucleic acid over a large
enough region to allow the two strands to
associate such that cleavage may occur at the
chosen temperature and at the desired location,
and the restriction being carried out using a
cleavage endonuclease that is active at the chosen
temperature; and

(vi) expressing a member of the family of
peptides, polypeptides or proteins coded, at least in
part, by the cleaved nucleic acids and collectively
expressing at least a portion of the diversity of the
family.

61. The methods according to claim 59 or 60,
further comprising at least one nucleic acid
amplification step between one or more of steps (i) and
(ii), steps (ii) and (iii), steps (iii) and (iv) and
steps (iv) and (v).

62. A library comprising a collection of 
genetic packages that display a member of a diverse 

family of peptides, polypeptides or proteins and
collectively display at least a portion of the 
diversity of the family, the library being produced 
using the methods of claims 59 or 61. 

63. A library comprising a collection of 
10 members of a diverse family of peptides, polypeptides 

or proteins and collectively comprise at least a 
portion of the diversity of the family, the library 
being produced using the methods of claims 60 or 61. 

64 . The methods and libraries according to 
15 any one of claim 59 to 63, wherein the members of the 

library encode immunoglobulins. 

65. The method and libraries according to
claim 64, wherein the double-stranded region of the
oligonucleotide encodes at least a part of a framework
sequence of an immunoglobulin.

66. The method and libraries according to 
claim 65, wherein the framework sequence comprises 
framework 1 of an antibody. 

67. The method and libraries according to
claim 66, wherein the framework sequence comprises
framework 1 of a variable domain of a light chain.

68. The method and libraries according to
claim 66, wherein the framework sequence comprises
framework 1 of a variable domain of a heavy chain.

69. The method and libraries according to
claim 65, wherein the framework sequence comprises
framework 3 of an antibody.

70. The method and libraries according to 
claim 69, wherein the framework sequence comprises 
framework 3 of a variable domain of a light chain. 

71. The method and libraries according to
claim 69, wherein the framework sequence is framework 3
of a variable domain of a heavy chain.

72. The method and libraries according to
claim 66, wherein the 5' primer is complementary to a
region outside framework 1.

73. The method according to claim 61, 
wherein amplification primers for the amplification 
step are functionally complementary to a constant 
region of the nucleic acids. 

74. The method according to claim 73,
wherein the constant region is genetically constant in
the nucleic acids.

75. The method according to claim 74,
wherein the genetically constant region is part of the
genome of immunoglobulin genes selected from the group
of IgM, IgG, IgA, IgE or IgD.
76. The method according to claim 13,
wherein the constant region is exogenous to the nucleic
acids.

77. The methods according to claim 61,
wherein the amplification step uses geneRACE™.

78. A vector comprising:

(i) a DNA sequence encoding an antibody
variable region linked to a version of pIII
anchor which does not mediate infection of
phage particles; and

(ii) wild-type gene III.

79. The vector according to claim 78,
wherein the DNA encodes a Fab.

80. The vector according to claim 78,
wherein the DNA encodes heavy chain VHCH1.

81. The vector according to claim 80,
wherein the heavy chain VHCH1 is linked to trpIII.

82. The vector according to claim 78,
wherein the DNA encodes light chain VLCL.

83. The vector according to claim 82,
wherein the light chain VLCL is linked to trpIII.

84. The vector according to claim 78,
wherein the DNA encodes scFv.

85. The vector according to claim 84,
wherein the scFv is VL-VH.

86. The vector according to claim 84,
wherein the scFv is VH-VL.

87. The vector according to claim 78,
wherein the DNA sequence encoding an antibody variable
region linked to a version of pIII anchor further
comprises an inducible promoter.

88. The vector according to claim 87,
wherein the inducible promoter regulates expression of
the DNA sequence encoding an antibody variable region
linked to a version of pIII anchor.

89. The vector according to claim 78,
wherein the DNA sequence encoding an antibody variable
region linked to a version of pIII anchor further
comprises an amber stop codon.

90. The vector according to claim 89,
wherein the DNA encoding the amber stop codon is
located between the antibody variable region and the
version of pIII.

91. The vector according to any one of 
claims 78 to 90 wherein the vector is phage or 
phagemid. 

92. A method for producing a population of
immunoglobulin genes that comprises steps of:

(i) introducing synthetic diversity
into at least one of CDR1 or CDR2 of
those genes; and

(ii) combining the diversity from
step (i) with CDR3 diversity captured
from B cells.

93. The method according to claim 92,
wherein synthetic diversity is introduced into both
CDR1 and CDR2.

94. A method for producing a library of
immunoglobulin genes that comprises

(i) introducing synthetic diversity
into at least one of CDR1 or CDR2 of
those genes; and

(ii) combining the diversity from
step (i) with CDR3 diversity captured
from B cells.

95. The method according to claim 94,
wherein synthetic diversity is introduced into both
CDR1 and CDR2.



96. A library of immunoglobulins that 
comprise members with at least one variable domain in 
which at least one of CDR1 and CDR2 contain synthetic 
diversity and CDR3 diversity is captured from B cells. 

97. A library according to claim 96, where
both CDR1 and CDR2 contain synthetic diversity.

98. The vector according to claim 78,
wherein the version of pIII anchor is characterized by
a wild type amino acid sequence and is encoded by a
non-wild type degenerate DNA sequence to a very high
extent.



99. In a method for displaying a member of a
diverse family of peptides, polypeptides or proteins on
the surface of a genetic package and collectively
displaying at least a part of the diversity of the
family, the improvement being characterized in that the
displayed peptide, polypeptide or protein is encoded by
a DNA sequence comprising a nucleic acid that has been
cleaved at a desired location by

(i) contacting the nucleic acid with a
partially double-stranded oligonucleotide,
the single-stranded region of the
oligonucleotide being functionally
complementary to the nucleic acid at its 5'
terminal and

(ii) cleaving the nucleic acid solely at
a restriction endonuclease cleavage site
located in the double-stranded region of the
oligonucleotide or amplifying the nucleic
acid using a primer at least in part
functionally complementary to at least a part
of the double-stranded region of the
oligonucleotide, the primer also introducing
on amplification an endonuclease cleavage
site and cleaving the amplified nucleic acid
sequence solely at that site;

the contacting and the cleaving steps being performed
at a temperature sufficient to maintain the nucleic
acid in substantially single-stranded form, the
oligonucleotide being functionally complementary to the
nucleic acid over a large enough region to allow the
two strands to associate such that cleavage may occur
at the chosen temperature and at the desired location,
and the cleavage being carried out using a restriction
endonuclease that is active at the chosen temperature.

100. A method for displaying a member of a
diverse family of peptides, polypeptides or proteins on
the surface of a genetic package and collectively
displaying at least a portion of the diversity of the
family, the method comprising the steps of:

(i) preparing a collection of nucleic acids
that code, at least in part, for members of the diverse
family;

(ii) rendering the nucleic acids single-stranded;

(iii) cleaving the single-stranded nucleic
acids at a desired location by a method comprising the
steps of:

(a) contacting the nucleic acid with a
partially double-stranded oligonucleotide,
the single-stranded region of the
oligonucleotide being functionally
complementary to the nucleic acid at its 5'
terminal region; and

(b) cleaving the nucleic acid solely at
a restriction endonuclease cleavage site
located in the double-stranded region of the
oligonucleotide or amplifying the nucleic
acid using a primer at least in part
functionally complementary to at least a part
of the double-stranded region of the
oligonucleotide, the primer also introducing
on amplification an endonuclease cleavage
site and cleaving the amplified nucleic acid
sequence solely at that site;

the contacting and the cleaving steps being
performed at a temperature sufficient to maintain
the nucleic acid in substantially single-stranded
form, the oligonucleotide being functionally
complementary to the nucleic acid over a large
enough region to allow the two strands to
associate such that cleavage may occur at the
chosen temperature and at the desired location,
and the restriction being carried out using a
cleavage endonuclease that is active at the chosen
temperature; and

(iv) displaying a member of the family of
peptides, polypeptides or proteins coded, at least in
part, by the cleaved nucleic acids on the surface of
the genetic package and collectively displaying at
least a portion of the diversity of the family.

101. In a method for expressing a member of a
diverse family of peptides, polypeptides or proteins
and collectively expressing at least a part of the
diversity of the family, the improvement being
characterized in that the expressed peptide,
polypeptide or protein is encoded by a DNA sequence
comprising a nucleic acid that has been cleaved at a
desired location by



10 



15 



20 



WO 02/083872 



PC1YUS02/12405 



- 256 - 

(i) contacting the nucleic acid with a
partially double-stranded oligonucleotide,
the single-stranded region of the
oligonucleotide being functionally
complementary to the nucleic acid at its 5'
terminal region; and

(ii) cleaving the nucleic acid solely at
the restriction endonuclease cleavage site
located in the double-stranded region of the
oligonucleotide or amplifying the nucleic
acid using a primer at least in part
functionally complementary to at least a part
of the double-stranded region of the
oligonucleotide, the primer also introducing
on amplification an endonuclease cleavage
site and cleaving the amplified nucleic acid
sequence solely at that site;

the contacting and the cleaving steps being performed
at a temperature sufficient to maintain the nucleic
acid in substantially single-stranded form, the
oligonucleotide being functionally complementary to the
nucleic acid over a large enough region to allow the
two strands to associate such that cleavage may occur
at the chosen temperature and at the desired location,
and the cleavage being carried out using a restriction
endonuclease that is active at the chosen temperature.

102. A method for expressing a member of a
diverse family of peptides, polypeptides or proteins
and collectively expressing at least a portion of the
diversity of the family, the method comprising the
steps of:

(i) preparing a collection of nucleic acids
that code, at least in part, for members of the diverse
family;

(ii) rendering the nucleic acids single-stranded;

(iii) cleaving the single-stranded nucleic
acids at a desired location by a method comprising the
steps of:

(a) contacting the nucleic acid with a
partially double-stranded oligonucleotide,
the single-stranded region of the
oligonucleotide being functionally
complementary to the nucleic acid at its 5'
terminal region; and

(b) cleaving the nucleic acid solely at
a restriction endonuclease cleavage site
located in the double-stranded region of the
oligonucleotide or amplifying the nucleic acid
using a primer at least in part functionally
complementary to at least a part of the
double-stranded region of the
oligonucleotide, the primer also introducing
on amplification an endonuclease cleavage
site and cleaving the amplified nucleic acid
sequence solely at that site;

the contacting and the cleaving steps being
performed at a temperature sufficient to maintain
the nucleic acid in substantially single-stranded
form, the oligonucleotide being functionally
complementary to the nucleic acid over a large
enough region to allow the two strands to
associate such that cleavage may occur at the
chosen temperature and at the desired location,
and the restriction being carried out using a
cleavage endonuclease that is active at the chosen
temperature; and

(iv) expressing a member of the family of
peptides, polypeptides or proteins coded, at least in
part, by the cleaved nucleic acids and collectively
expressing at least a portion of the diversity of the
family.

103. A method for preparing a library
comprising a collection of genetic packages that
display a member of a diverse family of peptides,
polypeptides or proteins and that collectively display
at least a portion of the family comprising the steps:

(i) preparing a collection of nucleic acids
that code at least in part for members of the diverse
family;

(ii) rendering the nucleic acids single-stranded;

(iii) cleaving the single-stranded nucleic
acids at a desired location by a method comprising the
steps of:

(a) contacting the nucleic acid with a
single-stranded oligonucleotide, the
oligonucleotide being functionally
complementary to the nucleic acid in the
region in which cleavage is desired and
including a sequence that with its complement
in the nucleic acid forms a restriction
endonuclease recognition site that on
restriction results in cleavage of the
nucleic acid at the desired location; and

(b) cleaving the nucleic acid solely at
the recognition site formed by the
complementation of the nucleic acid and the
oligonucleotide;

the contacting and the cleaving steps being
performed at a temperature sufficient to maintain
the nucleic acid in substantially single-stranded
form, the oligonucleotide being functionally
complementary to the nucleic acid over a large
enough region to allow the two strands to
associate such that cleavage may occur at the
chosen temperature and at the desired location,
and the cleavage being carried out using a
restriction endonuclease that is active at the
chosen temperature;

(iv) contacting the nucleic acid with a
partially double-stranded oligonucleotide, the
single-stranded region of the oligonucleotide being
functionally complementary to the nucleic acids in the
5' terminal region that remains after the cleavage in
step (iii) has been effected, and the double-stranded
region of the oligonucleotide including any sequences
necessary to return the sequences that remain after
cleavage into proper and original reading frame for
display; and

(v) cleaving the nucleic acid solely at a
restriction endonuclease cleavage site contained within
the double-stranded region of the partially
double-stranded oligonucleotide, the site being different from
that used in step (iii) or amplifying the nucleic acid
using a primer at least in part functionally
complementary to at least a part of the double-stranded
region of the oligonucleotide, the primer also
introducing on amplification an endonuclease cleavage
site and cleaving the amplified nucleic acid sequence
solely at that site;

the contacting and the cleaving steps being
performed at a temperature sufficient to maintain
the nucleic acid in substantially single-stranded
form, the oligonucleotide being functionally
complementary to the nucleic acid over a large
enough region to allow the two strands to
associate such that cleavage may occur at the
chosen temperature and at the desired location,
and the restriction being carried out using a
cleavage endonuclease that is active at the chosen
temperature; and

(vi) displaying a member of the family of
peptides, polypeptides or proteins coded, at least in
part, by the cleaved nucleic acids on the surface of
the genetic package and collectively displaying at
least a portion of the diversity of the family.

104. A method for preparing a library
comprising a collection of members of a diverse family
of peptides, polypeptides or proteins and collectively
comprising at least a portion of the family comprising
the steps:

(i) preparing a collection of nucleic acids
that code at least in part for members of the diverse
family;

(ii) rendering the nucleic acids single-stranded;

(iii) cleaving the single-stranded nucleic
acids at a desired location by a method comprising the
steps of:

(a) contacting the nucleic acid with a
single-stranded oligonucleotide, the
oligonucleotide being functionally
complementary to the nucleic acid in the
region in which cleavage is desired and
including a sequence that with its complement
in the nucleic acid forms a restriction
endonuclease recognition site that on
restriction results in cleavage of the
nucleic acid at the desired location; and

(b) cleaving the nucleic acid solely at
the recognition site formed by the
complementation of the nucleic acid and the
oligonucleotide;

the contacting and the cleaving steps being
performed at a temperature sufficient to maintain
the nucleic acid in substantially single-stranded
form, the oligonucleotide being functionally
complementary to the nucleic acid over a large
enough region to allow the two strands to
associate such that cleavage may occur at the
chosen temperature and at the desired location,
and the cleavage being carried out using a
restriction endonuclease that is active at the
chosen temperature;

(iv) contacting the nucleic acid with a
partially double-stranded oligonucleotide, the
single-stranded region of the oligonucleotide being
functionally complementary to the nucleic acids in the
5' terminal region that remains after the cleavage in
step (iii) has been effected, and the double-stranded
region of the oligonucleotide including any sequence
necessary to return the sequences that remain after
cleavage into proper and original reading frame for
expression; and

(v) cleaving the nucleic acid solely at a
restriction endonuclease cleavage site contained within
the double-stranded region of the partially
double-stranded oligonucleotide, the site being different from
that used in step (iii) or amplifying the nucleic acid
using a primer at least in part functionally
complementary to at least a part of the double-stranded
region of the oligonucleotide, the primer introducing
on amplification an endonuclease cleavage site and
cleaving the amplified nucleic acid sequence solely at
that site;

the contacting and the cleaving steps being
performed at a temperature sufficient to maintain
the nucleic acid in substantially single-stranded
form, the oligonucleotide being functionally
complementary to the nucleic acid over a large
enough region to allow the two strands to
associate such that cleavage may occur at the
chosen temperature and at the desired location,
and the restriction being carried out using a
cleavage endonuclease that is active at the chosen
temperature; and

(vi) expressing a member of the family of
peptides, polypeptides or proteins coded, at least in
part, by the cleaved nucleic acids and collectively
expressing at least a portion of the diversity of the
family.
105. A library of immunoglobulins comprising
members having at least one variable domain in which
one or both of the CDR 1 and CDR 2 have synthetic
diversity and the CDR 3 has diversity captured from
B-cells.

106. The library according to claim 104,
wherein a first variable domain has synthetic diversity
in CDR 1 and CDR 2 and has diversity in CDR 3 captured
from B-cells and a second variable domain has diversity
captured from B-cells.

107. The library according to claim 104 or
105, wherein the variable domain is selected from the
group of VH or VL.

108. A method for cleaving a nucleic acid
sequence at a desired location, the method comprising
the steps of:

(i) contacting a single-stranded nucleic
acid sequence with a partially double-stranded
oligonucleotide, the single-stranded
region of the oligonucleotide being
functionally complementary to the 5' terminal
region of the nucleic acid sequence, the
double-stranded region of the oligonucleotide
including any sequences necessary to return
the sequence in the single-stranded nucleic
acid sequence into proper and original
reading frame for expression; and

(ii) cleaving the partially double-stranded
oligonucleotide-single-stranded
nucleic acid combination solely at a
restriction endonuclease cleavage site
contained within the double-stranded
oligonucleotide or amplifying the combination
using a primer at least in part functionally
complementary to at least part of the
double-stranded region of the oligonucleotide, the
primer introducing during amplification an
endonuclease cleavage site and cleaving the
amplified sequence solely at the site.
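
The base-pairing relationships that such an oligonucleotide relies on can be checked mechanically. The following Python sketch is offered only as an illustration, not as the claimed method: it borrows SEQ ID NO: 1 and SEQ ID NO: 2 from the sequence listing as sample strings, and it is an assumption of this sketch (not a statement of the specification) that SEQ ID NO: 1 plays the role of a target and SEQ ID NO: 2 the role of an oligonucleotide with a self-pairing stem and a single-stranded tail. The function names are hypothetical.

```python
# Illustrative sketch only; sequences copied from the sequence listing,
# their pairing as "target" and "adapter" is an assumption of this example.
SEQ_ID_1 = "catgtgtattactgtgc"
SEQ_ID_2 = "cacatccgtgcttcttgcacggatgtggcacagtaatacacatg"

_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_adapter(adapter: str, target: str) -> tuple[str, str]:
    """Split an adapter into (self-pairing part, tail that anneals to target).

    Assumes the annealing tail is the 3' end of the adapter and is the exact
    reverse complement of the target; raises ValueError otherwise.
    """
    tail = reverse_complement(target)
    if not adapter.endswith(tail):
        raise ValueError("adapter does not end with the target's reverse complement")
    return adapter[: -len(tail)], tail


def hairpin_stem_length(seq: str) -> int:
    """Length of the longest 5' prefix that pairs with the 3' suffix of seq."""
    best = 0
    for k in range(1, len(seq) // 2 + 1):
        if reverse_complement(seq[:k]) == seq[-k:]:
            best = k
    return best


stem_loop, tail = split_adapter(SEQ_ID_2, SEQ_ID_1)
print("annealing tail:", tail, len(tail))
print("stem-loop part:", stem_loop, "stem length:", hairpin_stem_length(stem_loop))
```

The sketch treats complementarity as exact Watson-Crick pairing, which is stricter than the "functionally complementary" language of the claims; mismatches tolerated at the chosen temperature are outside its scope.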



109. The method according to claim 108,
wherein the length of the single-stranded portion of
the partially double-stranded oligonucleotide is
between 2 and 15 bases.

110. The method according to claim 109,
wherein the length of the single-stranded portion of
the partially double-stranded oligonucleotide is
between 7 and 10 bases.

111. The method according to claim 108,
wherein the length of the double-stranded portion of
the partially double-stranded oligonucleotide is
between 12 and 100 base pairs.

112. The method according to claim 111,
wherein the length of the double-stranded portion of
the partially double-stranded oligonucleotide is
between 20 and 100 base pairs.
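
The length ranges recited in claims 109 to 112 can be expressed as a simple lookup. The Python sketch below is a hypothetical helper, not part of the specification: the class and function names are invented for illustration, and it assumes that "between X and Y" is inclusive of both endpoints, which the claims do not state.

```python
from dataclasses import dataclass

# Ranges transcribed from claims 109-112: (portion, minimum, maximum).
# Treating the endpoints as inclusive is an assumption of this sketch.
CLAIM_RANGES = {
    "109": ("single", 2, 15),
    "110": ("single", 7, 10),
    "111": ("double", 12, 100),
    "112": ("double", 20, 100),
}


@dataclass(frozen=True)
class PartiallyDoubleStrandedOligo:
    """Hypothetical description of an oligonucleotide by its two lengths."""
    single_stranded_len: int  # bases in the single-stranded portion
    double_stranded_len: int  # base pairs in the double-stranded portion


def matching_length_claims(oligo: PartiallyDoubleStrandedOligo) -> list[str]:
    """Return the claim numbers whose length range the oligo falls within."""
    matches = []
    for claim, (portion, low, high) in CLAIM_RANGES.items():
        length = (oligo.single_stranded_len if portion == "single"
                  else oligo.double_stranded_len)
        if low <= length <= high:
            matches.append(claim)
    return matches


print(matching_length_claims(PartiallyDoubleStrandedOligo(8, 25)))
print(matching_length_claims(PartiallyDoubleStrandedOligo(12, 15)))
```

The helper checks lengths only; it says nothing about the dependency of each claim on claim 108 or about any other limitation.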



113. The methods according to any one of
claims 99 to 104 and 108, further comprising at least
one nucleic acid amplification step between one or more
of steps (i) and (ii), steps (ii) and (iii), steps
(iii) and (iv) and steps (iv) and (v).

114. A library comprising a collection of
genetic packages that display a member of a diverse
family of peptides, polypeptides or proteins and
collectively display at least a portion of the
diversity of the family, the library being produced
using the methods of claims 99, 100, 103 or 113.

115. A library comprising a collection of
members of a diverse family of peptides, polypeptides
or proteins and collectively comprise at least a
portion of the diversity of the family, the library
being produced using the methods of claims 101, 102,
104 or 113.

116. The methods and libraries according to
any one of claims 99 to 104 or 113, wherein the members
of the library encode immunoglobulins.
SEQUENCE LISTING

<110> LADNER, ROBERT C.
COHEN, EDWARD H.
NASTRI, HORACIO G.
ROOKEY, KRISTIN L.
HOET, RENE
HOOGENBOOM, HENDRICUS R. J. M.

<120> NOVEL METHODS OF CONSTRUCTING LIBRARIES COMPRISING
DISPLAYED AND/OR EXPRESSED MEMBERS OF A DIVERSE FAMILY
OF PEPTIDES, POLYPEPTIDES OR PROTEINS AND THE NOVEL
LIBRARIES

<130> DYAX/002 CIP2

<140> 10/045,674
<141> 2001-10-25

<150> 60/198,069
<151> 2000-04-17

<150> 09/837,306
<151> 2001-04-17

<160> 635

<170> PatentIn Ver. 2.1

<210> 1 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 1 

catgtgtatt actgtgc 17 



<210> 2 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 2 

cacatccgtg cttcttgcac ggatgtggca cagtaataca catg 44



<210> 3
<211> 18
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 3 

gtgtattaga ctgctgcc 



<210> 4 
<211> 43 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 4 

ggcagcagtc taatacacca catccgtgtt cttcacggat gtg 

<210> 5 

<211> 47 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 5 

cacatccgtg tttgttacac ggatgtggtg tcttacagtc cattctg 

<210> 6 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 6 

cagaatggac tgtaagacac 

<210> 7 
<211> 43 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 
<400> 7

atcgagtctc actgagccac atccgtggtt ttccacggat gtg 



<210> 8 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 8 

gctcagtgag actcgat 



<210> 9 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (10)..(24)

<223> A, T, C, G, other or unknown 
<400> 9 

cacgaggagn nnnnnnnnnn nnnn 

<210> 10 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 10 

atgaccgaat tgctacaag 

<210> 11 
<211> 46 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 
<400> 11

gactcctcag cttcttgctg aggagtcctt gtagcaattc ggtcat 

<210> 12 
<211> 6 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: 6 His tag 
<400> 12 

His His His His His His 
1 5 



<210> 13 
<211> 10 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (6)..(10)

<223> A, T, C, G, other or unknown 

<400> 13 
gtctcnnnnn 

<210> 14 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (1)..(6)

<223> A, T, C, G, other or unknown 

<400> 14 
nnnnnngaga c 



<210> 15 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthetic
oligonucleotide

<220>

<221> modified_base
<222> (11)..(24)

<223> A, T, C, G, other or unknown 
<400> 15 

cacggatgtg nnnnnnnnnn nnnn 

<210> 16 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (1)..(14)

<223> A, T, C, G, other or unknown 
<400> 16 

nnnnnnnnnn nnnncacatc cgtg 

<210> 17 
<211> 14 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 17 
gtgtattact gtgc 



<210> 18 
<211> 34 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 18 

cacatccgtg cacggatgtg gcacagtaat acac 
<210> 19
<211> 14 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 19 
gtgtattaga ctgc 

<210> 20 

<211> 34 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 20 

gcagtctaat acaccacatc cgtgcacgga tgtg 

<210> 21 
<211> 34 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 21 

cacatccgtg cacggatgtg gtgtcttaca gtcc 

<210> 22 
<211> 14 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 22 
ggactgtaag acac 

<210> 23 
<211> 34 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 23 

gagtctcact gagccacatc cgtgcacgga tgtg 



<210> 24 
<211> 14 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 24 
gctcagtgag actc 

<210> 25 
<211> 14 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 25 
gtgtattact gtgc 



<210> 26 
<211> 14 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 26 
gtatattact gtgc 



<210> 27 
<211> 14 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 27 
gtgtattact gtaa 
<210> 28
<211> 14 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 28 
gtgtattact gtac 



<210> 29 
<211> 14 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 29 
ttgtattact gtgc 



<210> 30 
<211> 14 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 30 
ttgtatcact gtgc 



<210> 31 
<211> 14 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 31 
acatattact gtgc 



<210> 32 
<211> 14 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthetic
oligonucleotide

<400> 32
acgtattact gtgc 14



<210> 33 
<211> 14 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 33 

atgtattact gtgc 14 



<210> 34 
<211> 101 
<212> DNA 

<213> Homo sapiens 
<400> 34 

agggtcacca tgaccaggga cacgtccatc agcacagcct acatgabcga gctgagcagg 60 
ctgagatctg acgacacggc cgtgtattac tgtgcgagag a 101 

<210> 35
<211> 98
<212> DNA

<213> Homo sapiens

<400> 35

agagtcacca ttaccaggga cacatccgcg agcacagcct acatggagct gagcagcctg 60
agatctgaag acacggctgt gtattactgt gcgagaga 98

<210> 36
<211> 98
<212> DNA

<213> Homo sapiens

<400> 36

agagtcacca tgaccaggaa cacctccata agcacagcct acatggagct gagcagcctg 60
agatctgagg acacggccgt gtattactgt gcgagagg 98

<210> 37
<211> 98
<212> DNA

<213> Homo sapiens

<400> 37

agagtcacca tgaccacaga cacatccacg agcacagcct acatggagct gaggagcctg 60
agatctgacg acacggccgt gtattactgt gcgagaga 98

<210> 38

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 38 

agagtcacca tgaccgagga cacatctaca gacacagcct acatggagct gagcagcctg 60 
agatctgagg acacggccgt gtattactgt gcaacaga 98 

<210> 39 

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 39 

agagtcacca ttaccaggga caggtctatg agcacagcct acatggagct gagcagcctg 60 
agatctgagg acacagccat gtattactgt gcaagata 98 

<210> 40 

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 40 

agagtcacca tgaccaggga cacgtccacg agcacagtct acatggagct gagcagcctg 60 
agatctgagg acacggccgt gtattactgt gcgagaga 98 

<210> 41 

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 41 

agagtcacca ttaccaggga catgtccaca agcacagcct acatggagct gagcagcctg 60 
agatccgagg acacggccgt gtattactgt gcggcaga 98 

<210> 42 

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 42 

agagtcacga ttaccgcgga cgaatccacg agcacagcct acatggagct gagcagcctg 60 
agatctgagg acacggccgt gtattactgt gcgagaga 98 

<210> 43 
<211> 98 
<212> DNA 

<213> Homo sapiens 
<400> 43 

agagtcacga ttaccgcgga caaatccacg agcacagcct acatggagct gagcagcctg 60 
agatctgagg acacggccgt gtattactgt gcgagaga 98 



<210> 44 
<211> 98 
<212> DNA 

<213> Homo sapiens 
<400> 44 

agagtcacca taaccgcgga cacgtctaca gacacagcct acatggagct gagcagcctg 60 
agatctgagg acacggccgt gtattactgt gcaacaga 98 



<210> 45 
<211> 100 
<212> DNA 

<213> Homo sapiens 
<400> 45 

aggctcacca tcaccaagga cacctccaaa aaccaggtgg tccttacaat gaccaacatg 60 
gaccctgtgg acacagccac atattactgt gcacacagac 100 



<210> 46 
<211> 100 
<212> DNA 

<213> Homo sapiens 
<400> 46 

aggctcacca tctccaagga cacctccaaa agccaggtgg tccttaccat gaccaacatg 60 
gaccctgtgg acacagccac atattactgt gcacggatac 100 



<210> 47 
<211> 100 
<212> DNA 

<213> Homo sapiens 
<400> 47 

aggctcacca tctccaagga cacctccaaa aaccaggtgg tccttacaat gaccaacatg 60 
gaccctgtgg acacagccac gtattactgt gcacggatac 100 



<210> 48 
<211> 98 
<212> DNA 

<213> Homo sapiens 
<400> 48 

cgattcacca tctccagaga caacgccaag aactcactgt atctgcaaat gaacagcctg 60 
agagccgagg acacggctgt gtattactgt gcgagaga 98 
<210> 49 

<211> 100 

<212> DNA 

<213> Homo sapiens 

<400> 49 

cgattcacca tctccagaga caacgccaag aactccctgt atctgcaaat gaacagtctg 60 
agagctgagg acacggcctt gtattactgt gcaaaagata 100 



<210> 50 

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 50 

cgattcacca tctccaggga caacgccaag aactcactgt atctgcaaat gaacagcctg 60 
agagccgagg acacggccgt gtattactgt gcgagaga 98 



<210> 51 

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 51 

cgattcacca tctccagaga aaatgccaag aactccttgt atcttcaaat gaacagcctg 60 
agagccgggg acacggctgt gtattactgt gcaagaga 98 



<210> 52 

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 52 

agattcacca tctcaagaga tgattcaaaa aacacgctgt atctgcaaat gaacagcctg 60 
aaaaccgagg acacagccgt gtattactgt accacaga 98 



<210> 53 

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 53 

cgattcacca tctccagaga caacgccaag aactccctgt atctgcaaat gaacagtctg 60 
agagccgagg acacggcctt gtatcactgt gcgagaga 98 



<210> 54 

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 54 

cgattcacca tctccagaga caacgccaag aactcactgt atctgcaaat gaacagcctg 60 
agagccgagg acacggctgt gtattactgt gcgagaga 98 
<210> 55 
<211> 98 

<212> DNA 
<213> Homo sapiens 

<400> 55 

cggttcacca tctccagaga caattccaag aacacgctgt atctgcaaat gaacagcctg 60 
agagccgagg acacggccgt atattactgt gcgaaaga 98 



<210> 56 
<211> 98 
<212> DNA 

<213> Homo sapiens 
<400> 56 

cgattcacca tctccagaga caattccaag aacacgctgt atctgcaaat gaacagcctg 60 
agagctgagg acacggctgt gtattactgt gcgaaaga 98 



<210> 57 
<211> 98 
<212> DNA 

<213> Homo sapiens 
<400> 57 

cgattcacca tctccagaga caattccaag aacacgctgt atctgcaaat gaacagcctg 60 
agagctgagg acacggctgt gtattactgt gcgagaga 98 



<210> 58 
<211> 98 
<212> DNA 

<213> Homo sapiens 
<400> 58 

cgattcacca tctccagaga caattccaag aacacgctgt atctgcaaat gaacagcctg 60 
agagctgagg acacggctgt gtattactgt gcgaaaga 98 



<210> 59 
<211> 98 
<212> DNA 

<213> Homo sapiens 
<400> 59 

cgattcacca tctccagaga caattccaag aacacgctgt atctgcaaat gaacagcctg 60 
agagccgagg acacggctgt gtattactgt gcgagaga 98 



<210> 60 
<211> 100 
<212> DNA 

<213> Homo sapiens 
<400> 60 

cgattcacca tctccagaga caacagcaaa aactccctgt atctgcaaat gaacagtctg 60 
agaactgagg acaccgcctt gtattactgt gcaaaagata 100 

<210> 61 

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 61 

cgattcacca tctccagaga caatgccaag aactcactgt atctgcaaat gaacagcctg 60 
agagacgagg acacggctgt gtattactgt gcgagaga 98 



<210> 62 

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 62 

agattcacca tctcaagaga tggttccaaa agcatcgcct atctgcaaat gaacagcctg 60 
aaaaccgagg acacagccgt gtattactgt actagaga 98 



<210> 63 

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 63 

cgattcacca tctccagaga caattccaag aacacgctgt atcttcaaat gaacagcctg 60 
agagccgagg acacggccgt gtattactgt gcgagaga 98 



<210> 64 

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 64 

agattcacca tctccagaga caattccaag aacacgctgt atcttcaaat gggcagcctg 60 
agagctgagg acatggctgt gtattactgt gcgagaga 98 



<210> 65 

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 65 

agattcacca tctccagaga caattccaag aacacgctgt atcttcaaat gaacagcctg 60 
agagctgagg acacggctgt gtattactgt gcgagaga 98 



<210> 66 
<211> 98 
<212> DNA 

<213> Homo sapiens 
<400> 66 

agattcacca tctcaagaga tgattcaaag aactcactgt atctgcaaat gaacagcctg 60 
aaaaccgagg acacggccgt gtattactgt gctagaga 98 



<210> 67 

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 67 

aggttcacca tctccagaga tgattcaaag aacacggcgt atctgcaaat gaacagcctg 60 
aaaaccgagg acacggccgt gtattactgt actagaca 98 



<210> 68 

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 68 

cgattcacca tctccagaga caacgccaag aacacgctgt atctgcaaat gaacagtctg 60 
agagccgagg acacggctgt gtattactgt gcaagaga 98 



<210> 69 

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 69 

agattcacca tctccagaga caattccaag aacacgctgc atcttcaaat gaacagcctg 60 
agagctgagg acacggctgt gtattactgt aagaaaga 98 



<210> 70 

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 70 

cgagtcacca tatcagtaga caagtccaag aaccagttct ccctgaagct gagctctgtg 60 
accgccgcgg acacggccgt gtattactgt gcgagaga 98 



<210> 71 

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 71 

cgagtcacca tgtcagtaga cacgtccaag aaccagttct ccctgaagct gagctctgtg 60 
accgccgtgg acacggccgt gtattactgt gcgagaaa 98 
<210> 72 

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 72 

cgagttacca tatcagtaga cacgtctaag aaccagttct ccctgaagct gagctctgtg 60 
actgccgcgg acacggccgt gtattactgt gcgagaga 98 

<210> 73 
<211> 98 
<212> DNA 

<213> Homo sapiens 
<400> 73 

cgagtcacca tatcagtaga caggtccaag aaccagttct ccctgaagct gagctctgtg 60 
accgccgcgg acacggccgt gtattactgt gccagaga 98 



<210> 74 

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 74 

cgagttacca tatcagtaga cacgtccaag aaccagttct ccctgaagct gagctctgtg 60 
actgccgcag acacggccgt gtattactgt gccagaga 98 



<210> 75 

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 75 

cgagttacca tatcagtaga cacgtctaag aaccagttct ccctgaagct gagctctgtg 60 
actgccgcgg acacggccgt gtattactgt gcgagaga 98 



<210> 76 

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 76 

cgagtcacca tatcagtaga cacgtccaag aaccagttct ccctgaagct gagctctgtg 60 
accgccgcgg acacggctgt gtattactgt gcgagaga 98 

<210> 77 

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 77 

cgagtcacca tatccgtaga cacgtccaag aaccagttct ccctgaagct gagctctgtg 60 
accgccgcag acacggctgt gtattactgt gcgagaca 98 
<210> 78 
<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 78 

cgagtcacca tatcagtaga cacgtccaag aaccagttct ccctgaagct gagctctgtg 60 
accgctgcgg acacggccgt gtattactgt gcgagaga 98 



<210> 79 
<211> 98 
<212> DNA 

<213> Homo sapiens 
<400> 79 

cgagtcacca tatcagtaga cacgtccaag aaccagttct ccctgaagct gagctctgtg 60 
accgctgcgg acacggccgt gtattactgt gcgagaga 98 



<210> 80 
<211> 98 
<212> DNA 

<213> Homo sapiens 
<400> 80 

cgagtcacca tatcagtaga cacgtccaag aaccagttct ccctgaagct gagctctgtg 60 
accgccgcag acacggccgt gtattactgt gcgagaga 98 



<210> 81 
<211> 98 
<212> DNA 

<213> Homo sapiens 
<400> 81 

caggtcacca tctcagccga caagtccatc agcaccgcct acctgcagtg gagcagcctg 60 
aaggcctcgg acaccgccat gtattactgt gcgagaca 98 



<210> 82 
<211> 96 
<212> DNA 

<213> Homo sapiens 
<400> 82 

cacgtcacca tctcagctga caagtccatc agcactgcct acctgcagtg gagcagcctg 60 
aaggcctcgg acaccgccat gtattactgt gcgaga 96 



<210> 83 

<211> 98 

<212> DNA 

<213> Homo sapiens 



WO 02/083872 



PCT/US02/12405 




<400> 83 

cgaataacca tcaacccaga cacatccaag aaccagttct ccctgcagct gaactctgtg 60 
actcccgagg acacggctgt gtattactgt gcaagaga 98 



<210> 84 

<211> 98 

<212> DNA 

<213> Homo sapiens 

<400> 84 

cggtttgtct tctccttgga cacctctgtc agcacggcat atctgcagat ctgcagccta 60 
aaggctgagg acactgccgt gtattactgt gcgagaga 98 



<210> 85 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 



<220> 

<221> modified_base 
<222> (3)..(9) 

<223> A, T, C, G, other or unknown 

<400> 85 
gcnnnnnnng c 11 

<210> 86 
<211> 10 
<212> DNA 

<213> Artificial Sequence 

<220> 
<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 
<221> modified_base 
<222> (4)..(7) 

<223> A, T, C, G, other or unknown 

<400> 86 
caynnnnrtg 10 



<210> 87 
<211> 11 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 

<222> (6)..(11) 

<223> A, T, C, G, other or unknown 

<400> 87 

gagtcnnnnn n 11 



<210> 88 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (1)..(6) 

<223> A, T, C, G, other or unknown 
<400> 88 

nnnnnngaga c 11 



<210> 89 
<211> 10 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4)..(7) 

<223> A, T, C, G, other or unknown 
<400> 89 

gaannnnttc 10 



<210> 90 
<211> 90 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 3-23 
FR3 nucleotide sequence 
<220> 
<221> CDS 
<222> (1)..(90) 

<220> 

<221> modified_base 
<222> (3) 

<223> A, T, C or G 
<220> 

<221> modified_base 
<222> (9) 

<223> A, T, C or G 
<220> 

<221> modified_base 

<222> (12) 

<223> A, T, C or G 

<220> 

<221> modified_base 

<222> (21) 

<223> A, T, C or G 

<220> 

<221> modified_base 

<222> (30) 

<223> A, T, C or G 

<220> 

<221> modified_base 

<222> (36) 

<223> A, T, C or G 

<220> 

<221> modified_base 

<222> (51) 

<223> A, T, C or G 

<220> 

<221> modified_base 

<222> (57) 

<223> A, T, C or G 

<220> 

<221> modified_base 

<222> (60) 

<223> A, T, C or G 

<220> 

<221> modified_base 

<222> (69) 

<223> A, T, C or G 

<220> 

<221> modified_base 

<222> (72) 

<223> A, T, C or G 
<220> 

<221> modified_base 

<222> (75) 

<223> A, T, C or G 

<220> 

<221> modified_base 

<222> (78) 

<223> A, T, C or G 

<220> 

<221> modified_base 

<222> (87) 

<223> A, T, C or G 

<400> 90 

acn ath wsn mgn gay aay wsn aar aay acn ytn tay ttn car atg aay 48 

Thr Ile Ser Arg Asp Asn Ser Lys Asn Thr Leu Tyr Leu Gln Met Asn 
1 5 10 15 

wsn ttr mgn gcn gar gay acn gcn gtn tay tay tgy gcn aar 90 
Ser Leu Arg Ala Glu Asp Thr Ala Val Tyr Tyr Cys Ala Lys 
20 25 30 



<210> 91 
<211> 30 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 3-23 
FR3 protein sequence 

<400> 91 

Thr Ile Ser Arg Asp Asn Ser Lys Asn Thr Leu Tyr Leu Gln Met Asn 
1 5 10 15 

Ser Leu Arg Ala Glu Asp Thr Ala Val Tyr Tyr Cys Ala Lys 
20 25 30 



<210> 92 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 



<400> 92 

agttctccct gcagctgaac tc 22 

<210> 93 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 93 

cactgtatct gcaaatgaac ag 

<210> 94 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 94 

ccctgtatct gcaaatgaac ag 

<210> 95 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 95 

ccgcctacct gcagtggagc ag 

<210> 96 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 96 

cgctgtatct gcaaatgaac ag 

<210> 97 
<211> 22 
<212> DNA 

<213> Artificial Sequence 



<220> 
<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 97 

cggcatatct gcagatctgc ag 22 



<210> 98 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 98 

cggcgtatct gcaaatgaac ag 22 



<210> 99 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 99 

ctgcctacct gcagtggagc ag 22 

<210> 100 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 100 

tcgcctatct gcaaatgaac ag 22 

<210> 101 
<211> 63 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 101 

cgcttcacta agtctagaga caactctaag aatactctct acttgcagat gaacagctta 60 
agg 63 
<210> 102 

<211> 45 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 102 

caagtagaga gtattcttag agttgtctct agacttagtg aagcg 45 

<210> 103 

<211> 54 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 103 

cgcttcacta agtctagaga caactctaag aatactctct acttgcagct gaac 54 

<210> 104 

<211> 54 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 104 

cgcttcacta agtctagaga caactctaag aatactctct acttgcaaat gaac 54 

<210> 105 
<211> 54 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 105 

cgcttcacta agtctagaga caactctaag aatactctct acttgcagtg gagc 54 

<210> 106 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 106 

cgcttcacta agtctagaga c 



<210> 107 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 107 

acatggagct gagcagcctg ag 



<210> 108 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 108 

acatggagct gagcaggctg ag 



<210> 109 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 109 

acatggagct gaggagcctg ag 

<210> 110 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 110 

acctgcagtg gagcagcctg aa 
<210> 111 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 111 

atctgcaaat gaacagcctg aa 

<210> 112 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 112 

atctgcaaat gaacagcctg ag 

<210> 113 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 113 

atctgcaaat gaacagtctg ag 

<210> 114 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 114 

atctgcagat ctgcagccta aa 

<210> 115 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 115 

atcttcaaat gaacagcctg ag 

<210> 116 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 116 

atcttcaaat gggcagcctg ag 



<210> 117 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 117 

ccctgaagct gagctctgtg ac 



<210> 118 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 118 

ccctgcagct gaactctgtg ac 



<210> 119 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 
<400> 119 

tccttacaat gaccaacatg ga 22 

<210> 120 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 120 

tccttaccat gaccaacatg ga 22 



<210> 121 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 121 

acatggagct gagcagcctg ag 22 

<210> 122 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 122 

ccctgaagct gagctctgtg ac 22 



<210> 123 
<211> 54 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 123 

cgcttcacta agtctagaga caactctaag aatactctct acttgcagat gaac 54 



<210> 124 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 124 

cgcttcactc agtctagaga taacagtaaa aatactttgt acttgcagct gagcagcctg 60 



<210> 125 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 125 

cgcttcactc agtctagaga taacagtaaa aatactttgt acttgcagct gagctctgtg 60 



<210> 126 
<211> 52 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 126 

tcagctgcaa gtacaaagta tttttactgt tatctctaga ctgagtgaag cg 52 



<210> 127 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 127 

cgcttcactc agtctagaga taac 24 



<210> 128 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 
<400> 128 

ccgtgtatta ctgtgcgaga ga 

<210> 129 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 129 

ctgtgtatta ctgtgcgaga ga 



<210> 130 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 130 

ccgtgtatta ctgtgcgaga gg 



<210> 131 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 131 

ccgtgtatta ctgtgcaaca ga 



<210> 132 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 132 

ccatgtatta ctgtgcaaga ta 
<210> 133 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 133 

ccgtgtatta ctgtgcggca ga 



<210> 134 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 134 

ccacatatta ctgtgcacac ag 



<210> 135 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 135 

ccacatatta ctgtgcacgg at 



<210> 136 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 136 

ccacgtatta ctgtgcacgg at 



<210> 137 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 137 

ccttgtatta ctgtgcaaaa ga 

<210> 138 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 138 

ctgtgtatta ctgtgcaaga ga 

<210> 139 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 139 

ccgtgtatta ctgtaccaca ga 

<210> 140 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 140 

ccttgtatca ctgtgcgaga ga 

<210> 141 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 



<400> 141 

ccgtatatta ctgtgcgaaa ga 
<210> 142 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 142 

ctgtgtatta ctgtgcgaaa ga 



<210> 143 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 143 

ccgtgtatta ctgtactaga ga 



<210> 144 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 144 

ccgtgtatta ctgtgctaga ga 



<210> 145 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 145 

ccgtgtatta ctgtactaga ca 



<210> 146 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 146 

ctgtgtatta ctgtaagaaa ga 

<210> 147 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 147 

ccgtgtatta ctgtgcgaga aa 



<210> 148 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 148 

ccgtgtatta ctgtgccaga ga 

<210> 149 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 149 

ctgtgtatta ctgtgcgaga ca 

<210> 150 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 
<400> 150 

ccatgtatta ctgtgcgaga ca 



<210> 151 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 151 

ccatgtatta ctgtgcgaga 



<210> 152 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 152 

ccgtgtatta ctgtgcgaga g 



<210> 153 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 153 

ctgtgtatta ctgtgcgaga g 



<210> 154 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 154 

ccgtgtatta ctgtgcgaga g 



<210> 155 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 155 

ccgtatatta ctgtgcgaaa g 

<210> 156 

<211> 21 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 156 

ctgtgtatta ctgtgcgaaa g 

<210> 157 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 157 

ctgtgtatta ctgtgcgaga c 

<210> 158 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 158 

ccatgtatta ctgtgcgaga c 

<210> 159 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 
<400> 159 

ccatgtatta ctgtgcgaga 20 



<210> 160 
<211> 94 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 160 

ggtgtagtga tctagtgaca actctaagaa tactctctac ttgcagatga acagctttag 60 
ggctgaggac actgcagtct actattgtgc gaga 94 



<210> 161 
<211> 94 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 161 

ggtgtagtga tctagtgaca actctaagaa tactctctac ttgcagatga acagctttag 60 
ggctgaggac actgcagtct actattgtgc gaaa 94 

<210> 162 
<211> 85 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 162 

atagtagact gcagtgtcct cagcccttaa gctgttcatc tgcaagtaga gagtattctt 60 
agagttgtct ctagatcact acacc 85 



<210> 163 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 



<400> 163 

ggtgtagtga tctagagaca ac 22 

<210> 164 

<211> 55 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 164 

ggtgtagtga aacagcttta gggctgagga cactgcagtc tactattgtg cgaga 55 

<210> 165 

<211> 55 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 165 

ggtgtagtga aacagcttta gggctgagga cactgcagtc tactattgtg cgaaa 55 

<210> 166 

<211> 46 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 166 

atagtagact gcagtgtcct cagcccttaa gctgtttcac tacacc 46 

<210> 167 
<211> 46 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 167 

ggtgtagtga aacagcttaa gggctgagga cactgcagtc tactat 46 

<210> 168 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 168 

ggtgtagtga aacagcttaa gggctg 

<210> 169 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 169 

agttctccct gcagctgaac tc 



<210> 170 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 170 

cactgtatct gcaaatgaac ag 

<210> 171 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 171 

ccctgtatct gcaaatgaac ag 

<210> 172 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 
<400> 172 

ccgcctacct gcagtggagc ag 

<210> 173 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 173 

cgctgtatct gcaaatgaac ag 

<210> 174 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 174 

cggcatatct gcagatctgc ag 

<210> 175 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 175 

cggcgtatct gcaaatgaac ag 

<210> 176 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 176 

ctgcctacct gcagtggagc ag 

<210> 177 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 177 

tcgcctatct gcaaatgaac ag 

<210> 178 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 178 

acatggagct gagcagcctg ag 

<210> 179 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 179 

acatggagct gagcaggctg ag 

<210> 180 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 180 

acatggagct gaggagcctg ag 

<210> 181 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 
<400> 181 

acctgcagtg gagcagcctg aa 

<210> 182 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 182 

atctgcaaat gaacagcctg aa 

<210> 183 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 183 

atctgcaaat gaacagcctg ag 

<210> 184 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 184 

atctgcaaat gaacagtctg ag 

<210> 185 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 185 

atctgcagat ctgcagccta aa 
<210> 186 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 186 

atcttcaaat gaacagcctg ag 



<210> 187 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 187 

atcttcaaat gggcagcctg ag 



<210> 188 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 188 

ccctgaagct gagctctgtg ac 



<210> 189 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 189 

ccctgcagct gaactctgtg ac 



<210> 190 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 190 

tccttacaat gaccaacatg ga 

<210> 191 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 191 

tccttaccat gaccaacatg ga 

<210> 192 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 192 

ccgtgtatta ctgtgcgaga ga 

<210> 193 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 193 

ctgtgtatta ctgtgcgaga ga 

<210> 194 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 194 

ccgtgtatta ctgtgcgaga gg 
<210> 195 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 195 

ccgtgtatta ctgtgcaaca ga 



<210> 196 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 196 

ccatgtatta ctgtgcaaga ta 



<210> 197 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 197 

ccgtgtatta ctgtgcggca ga 



<210> 198 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 198 

ccacatatta ctgtgcacac ag 



<210> 199 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 199 

ccacatatta ctgtgcacgg at 

<210> 200 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide

<400> 200 

ccacgtatta ctgtgcacgg at 

<210> 201 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 201 

ccttgtatta ctgtgcaaaa ga 

<210> 202 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 202 

ctgtgtatta ctgtgcaaga ga 

<210> 203 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 
<400> 203

ccgtgtatta ctgtaccaca ga 



<210> 204 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 204 

ccttgtatca ctgtgcgaga ga 



<210> 205 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 205 

ccgtatatta ctgtgcgaaa ga 



<210> 206 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 206 

ctgtgtatta ctgtgcgaaa ga 



<210> 207 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 207 

ccgtgtatta ctgtactaga ga 



<210> 208 
<211> 22 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 208 

ccgtgtatta ctgtgctaga ga 

<210> 209 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 209 

ccgtgtatta ctgtactaga ca 



<210> 210 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 210 

ctgtgtatta ctgtaagaaa ga 



<210> 211 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 211 

ccgtgtatta ctgtgcgaga aa 

<210> 212 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 
<400> 212

ccgtgtatta ctgtgccaga ga 



<210> 213 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 213 

ctgtgtatta ctgtgcgaga ca 



<210> 214 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 214 

ccatgtatta ctgtgcgaga ca 



<210> 215 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 215 

ccatgtatta ctgtgcgaga aa 



<210> 216 
<211> 90 
<212> DNA 

<213> Homo sapiens 
<400> 216 

caggtgcagc tggtgcagtc tggggctgag gtgaagaagc ctggggcctc agtgaaggtc 60 
tcctgcaagg cttctggata caccttcacc 90 



<210> 217 
<211> 90 
<212> DNA 

<213> Homo sapiens 
<400> 217

caggtccagc ttgtgcagtc tggggctgag gtgaagaagc ctggggcctc agtgaaggtt 60 
tcctgcaagg cttctggata caccttcact 90 



<210> 218 

<211> 90 

<212> DNA 

<213> Homo sapiens 



<400> 218 

caggtgcagc tggtgcagtc tggggctgag gtgaagaagc ctggggcctc agtgaaggtc 60 
tcctgcaagg cttctggata caccttcacc 90



<210> 219 

<211> 90 

<212> DNA 

<213> Homo sapiens 



<400> 219 

caggttcagc tggtgcagtc tggagctgag gtgaagaagc ctggggcctc agtgaaggtc 60 
tcctgcaagg cttctggtta cacctttacc 90 



<210> 220 

<211> 90 

<212> DNA 

<213> Homo sapiens 



<400> 220 

caggtccagc tggtacagtc tggggctgag gtgaagaagc ctggggcctc agtgaaggtc 60 
tcctgcaagg tttccggata caccctcact 90 



<210> 221 

<211> 90 

<212> DNA 

<213> Homo sapiens 

<400> 221 

cagatgcagc tggtgcagtc tggggctgag gtgaagaaga ctgggtcctc agtgaaggtt 60 
tcctgcaagg cttccggata caccttcacc 90 



<210> 222 

<211> 90 

<212> DNA 

<213> Homo sapiens 

<400> 222 

caggtgcagc tggtgcagtc tggggctgag gtgaagaagc ctggggcctc agtgaaggtt 60 
tcctgcaagg catctggata caccttcacc 90 



<210> 223 
<211> 90 
<212> DNA
<213> Homo sapiens

<400> 223

caaatgcagc tggtgcagtc tgggcctgag gtgaagaagc ctgggacctc agtgaaggtc 60
tcctgcaagg cttctggatt cacctttact 90



<210> 224 

<211> 90 

<212> DNA 

<213> Homo sapiens 

<400> 224 

caggtgcagc tggtgcagtc tggggctgag gtgaagaagc ctgggtcctc ggtgaaggtc 60 
tcctgcaagg cttctggagg caccttcagc 90 



<210> 225 

<211> 90 

<212> DNA 

<213> Homo sapiens 

<400> 225 

caggtgcagc tggtgcagtc tggggctgag gtgaagaagc ctgggtcctc ggtgaaggtc 60 
tcctgcaagg cttctggagg caccttcagc 90 



<210> 226 

<211> 90 

<212> DNA 

<213> Homo sapiens 

<400> 226 

gaggtccagc tggtacagtc tggggctgag gtgaagaagc ctggggctac agtgaaaatc 60 
tcctgcaagg tttctggata caccttcacc 90 



<210> 227 

<211> 90 

<212> DNA 

<213> Homo sapiens 

<400> 227 

cagatcacct tgaaggagtc tggtcctacg ctggtgaaac ccacacagac cctcacgctg 60 
acctgcacct tctctgggtt ctcactcagc 90 



<210> 228 

<211> 90 

<212> DNA 

<213> Homo sapiens 

<400> 228 

caggtcacct tgaaggagtc tggtcctgtg ctggtgaaac ccacagagac cctcacgctg 60 
acctgcaccg tctctgggtt ctcactcagc 90 
<210> 229
<211> 90 
<212> DNA 

<213> Homo sapiens 



<400> 229

caggtcacct tgaaggagtc tggtcctgcg ctggtgaaac ccacacagac cctcacactg 60 
acctgcacct tctctgggtt ctcactcagc 90 

<210> 230 
<211> 90 
<212> DNA 

<213> Homo sapiens 
<400> 230 

gaggtgcagc tggtggagtc tgggggaggc ttggtccagc ctggggggtc cctgagactc 60 
tcctgtgcag cctctggatt cacctttagt 90 



<210> 231 
<211> 90 
<212> DNA 

<213> Homo sapiens 



<400> 231 

gaagtgcagc tggtggagtc tgggggaggc ttggtacagc ctggcaggtc cctgagactc 60 
tcctgtgcag cctctggatt cacctttgat 90 



<210> 232 
<211> 90 
<212> DNA 

<213> Homo sapiens 



<400> 232 

caggtgcagc tggtggagtc tgggggaggc ttggtcaagc ctggagggtc cctgagactc 60 
tcctgtgcag cctctggatt caccttcagt 90 



<210> 233 
<211> 90 
<212> DNA 

<213> Homo sapiens 



<400> 233 

gaggtgcagc tggtggagtc tgggggaggc ttggtacagc ctggggggtc cctgagactc 60 
tcctgtgcag cctctggatt caccttcagt 90 

<210> 234 
<211> 90 
<212> DNA 

<213> Homo sapiens 



<400> 234 

gaggtgcagc tggtggagtc tgggggaggc ttggtaaagc ctggggggtc ccttagactc 60 
tcctgtgcag cctctggatt cactttcagt 90 
<210> 235
<211> 90 
<212> DNA 

<213> Homo sapiens 
<400> 235 

gaggtgcagc tggtggagtc tgggggaggt gtggtacggc ctggggggtc cctgagactc 60 
tcctgtgcag cctctggatt cacctttgat 90 



<210> 236 
<211> 90 
<212> DNA 

<213> Homo sapiens 
<400> 236 

gaggtgcagc tggtggagtc tgggggaggc ctggtcaagc ctggggggtc cctgagactc 60 
tcctgtgcag cctctggatt caccttcagt 90 



<210> 237 
<211> 90 
<212> DNA 

<213> Homo sapiens 
<400> 237 

gaggtgcagc tgttggagtc tgggggaggc ttggtacagc ctggggggtc cctgagactc 60 
tcctgtgcag cctctggatt cacctttagc 90 



<210> 238 
<211> 90 
<212> DNA 

<213> Homo sapiens 
<400> 238 

caggtgcagc tggtggagtc tgggggaggc gtggtccagc ctgggaggtc cctgagactc 60 
tcctgtgcag cctctggatt caccttcagt 90 



<210> 239 
<211> 90 
<212> DNA 

<213> Homo sapiens 
<400> 239 

caggtgcagc tggtggagtc tgggggaggc gtggtccagc ctgggaggtc cctgagactc 60 
tcctgtgcag cctctggatt caccttcagt 90 



<210> 240 
<211> 90 
<212> DNA 

<213> Homo sapiens 



<400> 240 

caggtgcagc tggtggagtc tgggggaggc gtggtccagc ctgggaggtc cctgagactc 60 
tcctgtgcag cctctggatt caccttcagt 90



<210> 241

<211> 90 

<212> DNA 

<213> Homo sapiens 

<400> 241 

caggtgcagc tggtggagtc tgggggaggc gtggtccagc ctgggaggtc cctgagactc 60 
tcctgtgcag cgtctggatt caccttcagt 90 



<210> 242 

<211> 90 

<212> DNA 

<213> Homo sapiens 

<400> 242 

gaagtgcagc tggtggagtc tgggggagtc gtggtacagc ctggggggtc cctgagactc 60 
tcctgtgcag cctctggatt cacctttgat 90 



<210> 243 

<211> 90 

<212> DNA 

<213> Homo sapiens 

<400> 243 

gaggtgcagc tggtggagtc tgggggaggc ttggtacagc ctggggggtc cctgagactc 60 
tcctgtgcag cctctggatt caccttcagt 90

<210> 244 

<211> 90 

<212> DNA 

<213> Homo sapiens 

<400> 244 

gaggtgcagc tggtggagtc tgggggaggc ttggtacagc cagggcggtc cctgagactc 60 
tcctgtacag cttctggatt cacctttggt 90



<210> 245 

<211> 90 

<212> DNA 

<213> Homo sapiens 

<400> 245 

gaggtgcagc tggtggagac tggaggaggc ttgatccagc ctggggggtc cctgagactc 60 
tcctgtgcag cctctgggtt caccgtcagt 90 



<210> 246 
<211> 90 
<212> DNA 
<213> Homo sapiens
<400> 246 

gaggtgcagc tggtggagtc tgggggaggc ttggtccagc ctggggggtc cctgagactc 60 
tcctgtgcag cctctggatt caccttcagt 90 



<210> 247 

<211> 90 

<212> DNA 

<213> Homo sapiens 

<400> 247 

gaggtgcagc tggtggagtc tgggggaggc ttggtccagc ctggggggtc cctgagactc 60 
tcctgtgcag cctctggatt caccgtcagt 90 



<210> 248 

<211> 90 

<212> DNA 

<213> Homo sapiens 

<400> 248 

gaggtgcagc tggtggagtc tgggggaggc ttggtccagc ctggagggtc cctgagactc 60 
tcctgtgcag cctctggatt caccttcagt 90 



<210> 249 

<211> 90 

<212> DNA 

<213> Homo sapiens 

<400> 249 

gaggtgcagc tggtggagtc tgggggaggc ttggtccagc ctggggggtc cctgaaactc 60 
tcctgtgcag cctctgggtt caccttcagt 90 



<210> 250 
<211> 90
<212> DNA 

<213> Homo sapiens 
<400> 250 

gaggtgcagc tggtggagtc cgggggaggc ttagttcagc ctggggggtc cctgagactc 60 
tcctgtgcag cctctggatt caccttcagt 90 



<210> 251 

<211> 90 

<212> DNA 

<213> Homo sapiens 



<400> 251 

gaggtgcagc tggtggagtc tcggggagtc ttggtacagc ctggggggtc cctgagactc 60 
tcctgtgcag cctctggatt caccgtcagt 90 
<210> 252

<211> 90

<212> DNA 

<213> Homo sapiens 

<400> 252

caggtgcagc tgcaggagtc gggcccagga ctggtgaagc cttcggggac cctgtccctc 60 
acctgcgctg tctctggtgg ctccatcagc 90 

<210> 253 

<211> 90 

<212> DNA 

<213> Homo sapiens 

<400> 253 

caggtgcagc tgcaggagtc gggcccagga ctggtgaagc cttcggacac cctgtccctc 60 
acctgcgctg tctctggtta ctccatcagc 90 



<210> 254 

<211> 90 

<212> DNA 

<213> Homo sapiens 

<400> 254 

caggtgcagc tgcaggagtc gggcccagga ctggtgaagc cttcacagac cctgtccctc 60 
acctgcactg tctctggtgg ctccatcagc 90 



<210> 255 

<211> 90 

<212> DNA 

<213> Homo sapiens 

<400> 255 

cagctgcagc tgcaggagtc cggctcagga ctggtgaagc cttcacagac cctgtccctc 60 
acctgcgctg tctctggtgg ctccatcagc 90 



<210> 256 

<211> 90 

<212> DNA 

<213> Homo sapiens 

<400> 256 

caggtgcagc tgcaggagtc gggcccagga ctggtgaagc cttcacagac cctgtccctc 60 
acctgcactg tctctggtgg ctccatcagc 90 

<210> 257 

<211> 90 

<212> DNA 

<213> Homo sapiens 

<400> 257 

caggtgcagc tgcaggagtc gggcccagga ctggtgaagc cttcacagac cctgtccctc 60 
acctgcactg tctctggtgg ctccatcagc 90 
<210> 258
<211> 90 
<212> DNA 

<213> Homo sapiens 
<400> 258 

caggtgcagc tacagcagtg gggcgcagga ctgttgaagc cttcggagac cctgtccctc 60 
acctgcgctg tctatggtgg gtccttcagt 90 



<210> 259 
<211> 90 
<212> DNA 

<213> Homo sapiens 
<400> 259 

cagctgcagc tgcaggagtc gggcccagga ctggtgaagc cttcggagac cctgtccctc 60 
acctgcactg tctctggtgg ctccatcagc 90 



<210> 260 
<211> 90 
<212> DNA 

<213> Homo sapiens 
<400> 260 

caggtgcagc tgcaggagtc gggcccagga ctggtgaagc cttcggagac cctgtccctc 60 
acctgcactg tctctggtgg ctccatcagt 90 



<210> 261 

<211> 90 

<212> DNA 

<213> Homo sapiens 

<400> 261 

caggtgcagc tgcaggagtc gggcccagga ctggtgaagc cttcggagac cctgtccctc 60 
acctgcactg tctctggtgg ctccgtcagc 90 



<210> 262 

<211> 90 

<212> DNA 

<213> Homo sapiens 

<400> 262 

caggtgcagc tgcaggagtc gggcccagga ctggtgaagc cttcggagac cctgtccctc 60 
acctgcgctg tctctggtta ctccatcagc 90 



<210> 263 

<211> 90 

<212> DNA 

<213> Homo sapiens 
<400> 263

gaggtgcagc tggtgcagtc tggagcagag gtgaaaaagc ccggggagtc tctgaagatc 60 
tcctgtaagg gttctggata cagctttacc 90 



<210> 264 

<211> 90 

<212> DNA 

<213> Homo sapiens 

<400> 264 

gaagtgcagc tggtgcagtc tggagcagag gtgaaaaagc ccggggagtc tctgaggatc 60 
tcctgtaagg gttctggata cagctttacc 90 

<210> 265 

<211> 90 

<212> DNA 

<213> Homo sapiens 

<400> 265 

caggtacagc tgcagcagtc aggtccagga ctggtgaagc cctcgcagac cctctcactc 60 
acctgtgcca tctccgggga cagtgtctct 90 

<210> 266 

<211> 90 

<212> DNA 

<213> Homo sapiens 

<400> 266 

caggtgcagc tggtgcaatc tgggtctgag ttgaagaagc ctggggcctc agtgaaggtt 60 
tcctgcaagg cttctggata caccttcact 90 



<210> 267 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 267 

ccgtgtatta ctgtgcgaga ga 22 



<210> 268 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 
<400> 268

ctgtgtatta ctgtgcgaga ga 



<210> 269 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 269 

ccgtgtatta ctgtgcgaga gg 



<210> 270 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 270 

ccgtatatta ctgtgcgaaa ga 



<210> 271 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 271 

ctgtgtatta ctgtgcgaaa ga 

<210> 272 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 272 

ctgtgtatta ctgtgcgaga ca 



<210> 273 
<211> 22 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 273 

ccatgtatta ctgtgcgaga ca 22 

<210> 274 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 274 

ccatgtatta ctgtgcgaga aa 22 



<210> 275 
<211> 69 
<212> DNA 

<213> Homo sapiens 
<400> 275 

gacatccaga tgacccagtc tccatcctcc ctgtctgcat ctgtaggaga cagagtcacc 60 
atcacttgc 69 

<210> 276 
<211> 69 
<212> DNA 

<213> Homo sapiens 
<400> 276 

gacatccaga tgacccagtc tccatcctcc ctgtctgcat ctgtaggaga cagagtcacc 60 
atcacttgc 69 



<210> 277 
<211> 69 
<212> DNA 

<213> Homo sapiens 
<400> 277 

gacatccaga tgacccagtc tccatcctcc ctgtctgcat ctgtaggaga cagagtcacc 60 
atcacttgc 69 



<210> 278 
<211> 69 
<212> DNA 

<213> Homo sapiens 
<400> 278

gacatccaga tgacccagtc tccatcctcc ctgtctgcat ctgtaggaga cagagtcacc 60 
atcacttgc 69 



<210> 279 
<211> 69 
<212> DNA 

<213> Homo sapiens 
<400> 279 

gacatccaga tgacccagtc tccatcctcc ctgtctgcat ctgtaggaga cagagtcacc 60 
atcacttgc 69 



<210> 280 
<211> 69 
<212> DNA 

<213> Homo sapiens 
<400> 280 

gacatccaga tgacccagtc tccatcctcc ctgtctgcat ctgtaggaga cagagtcacc 60 
atcacttgc 69 

<210> 281 
<211> 69 
<212> DNA 

<213> Homo sapiens 
<400> 281 

aacatccaga tgacccagtc tccatctgcc atgtctgcat ctgtaggaga cagagtcacc 60 
atcacttgt 69 



<210> 282 
<211> 69 
<212> DNA 

<213> Homo sapiens 
<400> 282 

gacatccaga tgacccagtc tccatcctca ctgtctgcat ctgtaggaga cagagtcacc 60 
atcacttgt 69 



<210> 283 
<211> 69 
<212> DNA 

<213> Homo sapiens 
<400> 283 

gacatccaga tgacccagtc tccatcctca ctgtctgcat ctgtaggaga cagagtcacc 60 
atcacttgt 69 



<210> 284 
<211> 69 
<212> DNA

<213> Homo sapiens 

<400> 284 

gccatccagt tgacccagtc tccatcctcc ctgtctgcat ctgtaggaga cagagtcacc 60 
atcacttgc 69 



<210> 285 

<211> 69 

<212> DNA 

<213> Homo sapiens 

<400> 285 

gccatccagt tgacccagtc tccatcctcc ctgtctgcat ctgtaggaga cagagtcacc 60 
atcacttgc 69 



<210> 286 

<211> 69 

<212> DNA 

<213> Homo sapiens 

<400> 286 

gacatccaga tgacccagtc tccatcttcc gtgtctgcat ctgtaggaga cagagtcacc 60 
atcacttgt 69 



<210> 287 

<211> 69 

<212> DNA 

<213> Homo sapiens 

<400> 287 

gacatccaga tgacccagtc tccatcttct gtgtctgcat ctgtaggaga cagagtcacc 60 
atcacttgt 69 



<210> 288 

<211> 69 

<212> DNA 

<213> Homo sapiens 

<400> 288 

gacatccagt tgacccagtc tccatccttc ctgtctgcat ctgtaggaga cagagtcacc 60 
atcacttgc 69 



<210> 289 

<211> 69 

<212> DNA 

<213> Homo sapiens 



<400> 289

gccatccgga tgacccagtc tccattctcc ctgtctgcat ctgtaggaga cagagtcacc 60
atcacttgc 69

<210> 290
<211> 69 
<212> DNA 

<213> Homo sapiens 



<400> 290 

gccatccgga tgacccagtc tccatcctca ttctctgcat ctacaggaga cagagtcacc 60 
atcacttgt 69 



<210> 291 
<211> 69 
<212> DNA 

<213> Homo sapiens 
<400> 291 

gtcatctgga tgacccagtc tccatcctta ctctctgcat ctacaggaga cagagtcacc 60 
atcagttgt 69 



<210> 292 

<211> 69 

<212> DNA 

<213> Homo sapiens 

<400> 292 

gccatccaga tgacccagtc tccatcctcc ctgtctgcat ctgtaggaga cagagtcacc 60 
atcacttgc 69 



<210> 293 
<211> 69 
<212> DNA 

<213> Homo sapiens 
<400> 293 

gacatccaga tgacccagtc tccttccacc ctgtctgcat ctgtaggaga cagagtcacc 60 
atcacttgc 69 



<210> 294 
<211> 69 
<212> DNA 

<213> Homo sapiens 



<400> 294 

gatattgtga tgacccagac tccactctcc ctgcccgtca cccctggaga gccggcctcc 60 
atctcctgc 69 



<210> 295 
<211> 69 
<212> DNA 

<213> Homo sapiens 



<400> 295 

gatattgtga tgacccagac tccactctcc ctgcccgtca cccctggaga gccggcctcc 60 
atctcctgc 69 
<210> 296

<211> 69 

<212> DNA 

<213> Homo sapiens 



<400> 296 

gatgttgtga tgactcagtc tccactctcc ctgcccgtca cccttggaca gccggcctcc 60 
atctcctgc 69 



<210> 297 

<211> 69 

<212> DNA 

<213> Homo sapiens 



<400> 297 

gatgttgtga tgactcagtc tccactctcc ctgcccgtca cccttggaca gccggcctcc 60 
atctcctgc 69 



<210> 298 

<211> 69 

<212> DNA 

<213> Homo sapiens 



<400> 298 

gatattgtga tgacccagac tccactctct ctgtccgtca cccctggaca gccggcctcc 60 
atctcctgc 69 



<210> 299 
<211> 69 
<212> DNA 

<213> Homo sapiens 



<400> 299 

gatattgtga tgacccagac tccactctct ctgtccgtca cccctggaca gccggcctcc 60 
atctcctgc 69 



<210> 300 
<211> 69 
<212> DNA 

<213> Homo sapiens 



<400> 300 

gatattgtga tgactcagtc tccactctcc ctgcccgtca cccctggaga gccggcctcc 60 
atctcctgc 69 



<210> 301 

<211> 69 

<212> DNA 

<213> Homo sapiens 
<400> 301

gatattgtga tgactcagtc tccactctcc ctgcccgtca cccctggaga gccggcctcc 60 
atctcctgc 69 



<210> 302 

<211> 69 

<212> DNA 

<213> Homo sapiens 



<400> 302 

gatattgtga tgacccagac tccactctcc tcacctgtca cccttggaca gccggcctcc 60 
atctcctgc 69 



<210> 303 

<211> 69 

<212> DNA 

<213> Homo sapiens 



<400> 303 

gaaattgtgt tgacgcagtc tccaggcacc ctgtctttgt ctccagggga aagagccacc 60 
ctctcctgc 69 



<210> 304 

<211> 69 

<212> DNA 

<213> Homo sapiens 



<400> 304 

gaaattgtgt tgacgcagtc tccagccacc ctgtctttgt ctccagggga aagagccacc 60 
ctctcctgc 69 



<210> 305 

<211> 69 

<212> DNA 

<213> Homo sapiens 



<400> 305 

gaaatagtga tgacgcagtc tccagccacc ctgtctgtgt ctccagggga aagagccacc 60 
ctctcctgc 69 



<210> 306 

<211> 69 

<212> DNA 

<213> Homo sapiens 

<400> 306 

gaaatagtga tgacgcagtc tccagccacc ctgtctgtgt ctccagggga aagagccacc 60 
ctctcctgc 69 



<210> 307 
<211> 69 
<212> DNA

<213> Homo sapiens 

<400> 307 

gaaattgtgt tgacacagtc tccagccacc ctgtctttgt ctccagggga aagagccacc 60
ctctcctgc 69 



<210> 308 

<211> 69 

<212> DNA 

<213> Homo sapiens 



<400> 308 

gaaattgtgt tgacacagtc tccagccacc ctgtctttgt ctccagggga aagagccacc 60 
ctctcctgc 69 



<210> 309 

<211> 69 

<212> DNA 

<213> Homo sapiens 

<400> 309 

gaaattgtaa tgacacagtc tccagccacc ctgtctttgt ctccagggga aagagccacc 60 
ctctcctgc 69 



<210> 310 

<211> 69 

<212> DNA 

<213> Homo sapiens 

<400> 310 

gacatcgtga tgacccagtc tccagactcc ctggctgtgt ctctgggcga gagggccacc 60 
atcaactgc 69 



<210> 311 

<211> 69 

<212> DNA 

<213> Homo sapiens 



<400> 311 

gaaacgacac tcacgcagtc tccagcattc atgtcagcga ctccaggaga caaagtcaac 60 
atctcctgc 69 



<210> 312 

<211> 69 

<212> DNA 

<213> Homo sapiens 



<400> 312 

gaaattgtgc tgactcagtc tccagacttt cagtctgtga ctccaaagga gaaagtcacc 60 
atcacctgc 69 
<210> 313
<211> 69 
<212> DNA 

<213> Homo sapiens 

<400> 313

gaaattgtgc tgactcagtc tccagacttt cagtctgtga ctccaaagga gaaagtcacc 60 
atcacctgc 69 



<210> 314 
<211> 69 
<212> DNA 

<213> Homo sapiens 
<400> 314 

gatgttgtga tgacacagtc tccagctttc ctctctgtga ctccagggga gaaagtcacc 60 
atcacctgc 69 



<210> 315 
<211> 66 
<212> DNA 

<213> Homo sapiens 
<400> 315 

cagtctgtgc tgactcagcc accctcggtg tctgaagccc ccaggcagag ggtcaccatc 60 
tcctgt 66 



<210> 316 
<211> 66 
<212> DNA 

<213> Homo sapiens 
<400> 316 

cagtctgtgc tgacgcagcc gccctcagtg tctggggccc cagggcagag ggtcaccatc 60 
tcctgc 66 



<210> 317 
<211> 66 
<212> DNA 

<213> Homo sapiens 



<400> 317 

cagtctgtgc tgactcagcc accctcagcg tctgggaccc ccgggcagag ggtcaccatc 60 
tcttgt 66 



<210> 318 
<211> 66 
<212> DNA 

<213> Homo sapiens 



<400> 318 

cagtctgtgc tgactcagcc accctcagcg tctgggaccc ccgggcagag ggtcaccatc 60 
tcttgt 66 
<210> 319

<211> 66 

<212> DNA 

<213> Homo sapiens 



<400> 319 

cagtctgtgt tgacgcagcc gccctcagtg tctgcggccc caggacagaa ggtcaccatc 60 
tcctgc 66 



<210> 320 

<211> 66 

<212> DNA 

<213> Homo sapiens 



<400> 320 

cagtctgccc tgactcagcc tccctccgcg tccgggtctc ctggacagtc agtcaccatc 60 
tcctgc 66 



<210> 321 

<211> 66 

<212> DNA 

<213> Homo sapiens 



<400> 321 

cagtctgccc tgactcagcc tcgctcagtg tccgggtctc ctggacagtc agtcaccatc 60 
tcctgc 66 



<210> 322 

<211> 66 

<212> DNA 

<213> Homo sapiens 



<400> 322 

cagtctgccc tgactcagcc tgcctccgtg tctgggtctc ctggacagtc gatcaccatc 60 
tcctgc 66



<210> 323 

<211> 66 

<212> DNA 

<213> Homo sapiens 



<400> 323 

cagtctgccc tgactcagcc tccctccgtg tccgggtctc ctggacagtc agtcaccatc 60 
tcctgc 66 



<210> 324 
<211> 66 
<212> DNA 

<213> Homo sapiens 



<400> 324 
cagtctgccc tgactcagcc tgcctccgtg tctgggtctc ctggacagtc gatcaccatc 60
tcctgc 66 



<210> 325
<211> 66 
<212> DNA 

<213> Homo sapiens 
<400> 325 

tcctatgagc tgactcagcc accctcagtg tccgtgtccc caggacagac agccagcatc 60 
acctgc 66 



<210> 326 
<211> 66 
<212> DNA 

<213> Homo sapiens 
<400> 326 

tcctatgagc tgactcagcc actctcagtg tcagtggccc tgggacagac ggccaggatt 60 
acctgt 66 



<210> 327 
<211> 66 
<212> DNA 

<213> Homo sapiens 
<400> 327 

tcctatgagc tgacacagcc accctcggtg tcagtgtccc caggacaaac ggccaggatc 60 
acctgc 66 



<210> 328 
<211> 66 
<212> DNA 

<213> Homo sapiens 
<400> 328 

tcctatgagc tgacacagcc accctcggtg tcagtgtccc taggacagat ggccaggatc 60 
acctgc 66 



<210> 329 
<211> 66 
<212> DNA 

<213> Homo sapiens 
<400> 329 

tcttctgagc tgactcagga ccctgctgtg tctgtggcct tgggacagac agtcaggatc 60 
acatgc 66 



<210> 330 
<211> 66 
<212> DNA 

<213> Homo sapiens 
<400> 330

tcctatgtgc tgactcagcc accctcagtg tcagtggccc caggaaagac ggccaggatt 60 
acctgt 66 



<210> 331 

<211> 66 

<212> DNA 

<213> Homo sapiens 



<400> 331 

tcctatgagc tgacacagct accctcggtg tcagtgtccc caggacagac agccaggatc 60 
acctgc 66 



<210> 332 

<211> 66 

<212> DNA 

<213> Homo sapiens 

<400> 332 

tcctatgagc tgatgcagcc accctcggtg tcagtgtccc caggacagac ggccaggatc 60 
acctgc 66 



<210> 333 

<211> 66 

<212> DNA 

<213> Homo sapiens 

<400> 333 

tcctatgagc tgacacagcc atcctcagtg tcagtgtctc cgggacagac agccaggatc 60 
acctgc 66 



<210> 334 

<211> 66 

<212> DNA 

<213> Homo sapiens 

<400> 334 

ctgcctgtgc tgactcagcc cccgtctgca tctgccttgc tgggagcctc gatcaagctc 60 
acctgc 66 



<210> 335 

<211> 66 

<212> DNA 

<213> Homo sapiens 

<400> 335 

cagcctgtgc tgactcaatc atcctctgcc tctgcttccc tgggatcctc ggtcaagctc 60 
acctgc 66 



<210> 336 
<211> 66 
<212> DNA

<213> Homo sapiens 

<400> 336 

cagcttgtgc tgactcaatc gccctctgcc tctgcctccc tgggagcctc ggtcaagctc 60 
acctgc 66 



<210> 337 

<211> 66 

<212> DNA 

<213> Homo sapiens 

<400> 337 

cagcctgtgc tgactcagcc accttcctcc tccgcatctc ctggagaatc cgccagactc 60 
acctgc 66 



<210> 338 

<211> 66 

<212> DNA 

<213> Homo sapiens 

<400> 338 

caggctgtgc tgactcagcc ggcttccctc tctgcatctc ctggagcatc agccagtctc 60 
acctgc 66 



<210> 339 

<211> 66 

<212> DNA 

<213> Homo sapiens 

<400> 339 

cagcctgtgc tgactcagcc atcttcccat tctgcatctt ctggagcatc agtcagactc 60 
acctgc 66 



<210> 340 

<211> 66 

<212> DNA 

<213> Homo sapiens 

<400> 340 

aattttatgc tgactcagcc ccactctgtg tcggagtctc cggggaagac ggtaaccatc 60 
tcctgc 66



<210> 341 
<211> 66 
<212> DNA 

<213> Homo sapiens 
<400> 341 

cagactgtgg tgactcagga gccctcactg actgtgtccc caggagggac agtcactctc 60 
acctgt 66 



<210> 342

<211> 66 

<212> DNA 

<213> Homo sapiens 

<400> 342

caggctgtgg tgactcagga gccctcactg actgtgtccc caggagggac agtcactctc 60 
acctgt 66 



<210> 343 

<211> 66 

<212> DNA 

<213> Homo sapiens 

<400> 343 

cagactgtgg tgacccagga gccatcgttc tcagtgtccc ctggagggac agtcacactc 60 
acttgt 66 



<210> 344 

<211> 66 

<212> DNA 

<213> Homo sapiens 

<400> 344 

cagcctgtgc tgactcagcc accttctgca tcagcctccc tgggagcctc ggtcacactc 60 
acctgc 66 



<210> 345 

<211> 66 

<212> DNA 

<213> Homo sapiens 

<400> 345 

caggcagggc tgactcagcc accctcggtg tccaagggct tgagacagac cgccacactc 60 
acctgc 66 



<210> 346 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (1)..(6)

<223> A, T, C, G, other or unknown 



<400> 346 
nnnnnngact c 11

<210> 347
<211> 11
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (6)..(11)

<223> A, T, C, G, other or unknown 

<400> 347 
gagtcnnnnn n 



<210> 348 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (3)..(9)

<223> A, T, C, G, other or unknown 

<400> 348 
gcnnnnnnng c 

<210> 349 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (7)..(11)

<223> A, T, C, G, other or unknown 

<400> 349 
acctgcnnnn n 



<210> 350 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 350 

cacatccgtg ttgttcacgg atgtg 25 



<210> 351 
<211> 88 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 351 

aatagtagac tgcagtgtcc tcagccctta agctgttcat ctgcaagtag agagtattct 60 
tagagttgtc tctagactta gtgaagcg 88 



<210> 352 
<211> 88 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 352 

cgcttcacta agtctagaga caactctaag aatactctct acttgcagat gaacagctta 60 
agggctgagg acactgcagt ctactatt 88 



<210> 353 
<211> 95 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 353 

cgcttcacta agtctagaga caactctaag aatactctct acttgcagat gaacagctta 60 
agggctgagg acactgcagt ctactattgt gcgag 95 

<210> 354 
<211> 95 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 354 

cgcttcacta agtctagaga caactctaag aatactctct acttgcagat gaacagctta 60
agggctgagg acactgcagt ctactattgt acgag 95 



<210> 355 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 355 

cgcttcacta agtctagaga caac 24 



<210> 356 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (8)..(15)

<223> A, T, C, G, other or unknown 
<400> 356 

cacctgcnnn nnnnn 15 



<210> 357 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (7)..(17)

<223> A, T, C, G, other or unknown 



<400> 357 

cagctcnnnn nnnnnnn 17

<210> 358
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (7)..(17)

<223> A, T, C, G, other or unknown 
<400> 358 

gaagacnnnn nnnnnnn 

<210> 359 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (6)..(17)

<223> A, T, C, G, other or unknown
<400> 359 

gcagcnnnnn nnnnnnn 

<210> 360 
<211> 12 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (7)..(12)

<223> A, T, C, G, other or unknown

<400> 360 
gaagacnnnn nn 

<210> 361 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (7)..(22)

<223> A, T, C, G, other or unknown 
<400> 361 

cttgagnnnn nnnnnnnnnn nn 

<210> 362 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (6)..(19)

<223> A, T, C, G, other or unknown 
<400> 362 

acggcnnnnn nnnnnnnnn 

<210> 363 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (6)..(18)

<223> A, T, C, G, other or unknown 
<400> 363 

acggcnnnnn nnnnnnnn 

<210> 364 
<211> 12 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 
<220>

<221> modified_base 
<222> (7)..(12)

<223> A, T, C, G, other or unknown 

<400> 364 
gtatccnnnn nn 



<210> 365 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (7)..(11)

<223> A, T, C, G, other or unknown 

<400> 365 
actgggnnnn n 

<210> 366 
<211> 10 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (6)..(10)

<223> A, T, C, G, other or unknown 

<400> 366 
ggatcnnnnn 



<210> 367 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (6).. (11) 




<223> A, T, C, G, other or unknown 

<400> 367 
gcatcnnnnn n 



<210> 368 
<211> 16 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (7)..(16)

<223> A, T, C, G, other or unknown 

<400> 368 
gaggagnnnn nnnnnn 

<210> 369 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (6) . . (19) 

<223> A, T, C, G, other or unknown 
<400> 369 

gggacnnnnn nnnnnnnnn 

<210> 370 
<211> 14 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (7).. (14) 

<223> A, T, C, G, other or unknown 

<400> 370 
acctgcnnnn nnnn 
<210> 371

<211> 17 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base

<222> (7)..(17)

<223> A, T, C, G, other or unknown 

<400> 371 

ggcggannnn nnnnnnn 

<210> 372 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (7).. (22) 

<223> A, T, C, G, other or unknown 
<400> 372 

ctgaagnnnn nnnnnnnnnn nn 

<210> 373 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (6)..(11)

<223> A, T, C, G, other or unknown 

<400> 373 
cccgcnnnnn n 



<210> 374 
<211> 18 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (6)..(18)

<223> A, T, C, G, other or unknown 
<400> 374 

ggatgnnnnn nnnnnnnn 

<210> 375 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (7)..(22)

<223> A, T, C, G, other or unknown 
<400> 375 

ctggagnnnn nnnnnnnnnn nn 

<210> 376 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (6)..(15)

<223> A, T, C, G, other or unknown 

<400> 376 
gacgcnnnnn nnnnn 



<210> 377 
<211> 13 
<212> DNA 

<213> Artificial Sequence 



<220> 
<223> Description of Artificial Sequence: Synthetic
oligonucleotide 

<220> 

<221> modified_base
<222> (6)..(13)

<223> A, T, C, G, other or unknown 

<400> 377 
ggtgannnnn nnn 

<210> 378 
<211> 13 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (6)..(13)

<223> A, T, C, G, other or unknown 

<400> 378 
gaagannnnn nnn 

<210> 379 
<211> 10 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (6)..(10)

<223> A, T, C, G, other or unknown 

<400> 379 
gagtcnnnnn 

<210> 380 
<211> 26
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 
<220>

<221> modified_base
<222> (7)..(26)

<223> A, T, C, G, other or unknown
<400> 380 

tccracnnnn nnnnnnnnnn nnnnnn 



<210> 381 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (5)..(11)

<223> A, T, C, G, other or unknown 

<400> 381 
cctcnnnnnn n 



<210> 382 
<211> 10 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (6)..(10)

<223> A, T, C, G, other or unknown 

<400> 382 
gagtcnnnnn 

<210> 383 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (7)..(18)

<223> A, T, C, G, other or unknown 




<400> 383 

cccacannnn nnnnnnnn 



<210> 384 
<211> 14 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (6)..(14)

<223> A, T, C, G, other or unknown 

<400> 384 
gcatcnnnnn nnnn 

<210> 385

<211> 13 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base

<222> (6)..(13)

<223> A, T, C, G, other or unknown 

<400> 385 
ggtgannnnn nnn 

<210> 386 

<211> 12 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base

<222> (5)..(12)

<223> A, T, C, G, other or unknown 

<400> 386 
cccgnnnnnn nn 
<210> 387
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (6) . . (19) 

<223> A, T, C, G, other or unknown 
<400> 387 

ggatgnnnnn nnnnnnnnn 

<210> 388 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (7)..(17)

<223> A, T, C, G, other or unknown 
<400> 388 

gaccgannnn nnnnnnn 

<210> 389 
<211> 17 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (7) . . (17) 

<223> A, T, C, G, other or unknown 
<400> 389 

cacccannnn nnnnnnn 



<210> 390 
<211> 17 
<212> DNA 
<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (7)..(17)

<223> A, T, C, G, other or unknown
<400> 390 

caarcannnn nnnnnnn 

<210> 391 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 391 

gctgtgtatt actgtgcgag 

<210> 392 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 392 

gccgtgtatt actgtgcgag 

<210> 393 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 393 

gccgtatatt actgtgcgag 

<210> 394 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 394 

gccgtgtatt actgtacgag 20 



<210> 395 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
probe 

<400> 395 

gccatgtatt actgtgcgag 20 



<210> 396 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 396 

cacatccgtg ttgttcacgg atgtg 25 



<210> 397 
<211> 88 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 397 

aatagtagac tgcagtgtcc tcagccctta agctgttcat ctgcaagtag agagtattct 60 
tagagttgtc tctagactta gtgaagcg 88 



<210> 398 
<211> 95 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 
<400> 398

cgcttcacta agtctagaga caactctaag aatactctct acttgcagat gaacagctta 60 
agggctgagg acactgcagt ctactattgt gcgag 95 

<210> 399
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 399 

cgcttcacta agtctagaga caac 24 



<210> 400 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 400 

cacatccgtg ttgttcacgg atgtgggagg atggagactg ggtc 44 

<210> 401 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 401 

cacatccgtg ttgttcacgg atgtgggaga gtggagactg agtc 44 

<210> 402 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 402 

cacatccgtg ttgttcacgg atgtgggtgc ctggagactg cgtc 44 



<210> 403 
<211> 44
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 403 

cacatccgtg ttgttcacgg atgtgggtgg ctggagactg cgtc 



<210> 404 
<211> 34 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 404 

cctctactct tgtcacagtg cacaagacat ccag 



<210> 405 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 405 

cctctactct tgtcacagtg 



<210> 406 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 406 

ggaggatgga ctggatgtct tgtgcactgt gacaagagta gagg 



<210> 407 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide
<400> 407 

ggagagtgga ctggatgtct tgtgcactgt gacaagagta gagg 

<210> 408 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 408 

ggtgcctgga ctggatgtct tgtgcactgt gacaagagta gagg 

<210> 409 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 409 

ggtggctgga ctggatgtct tgtgcactgt gacaagagta gagg 

<210> 410 

<211> 44 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 410 

cacatccgtg ttgttcacgg atgtggatcg actgtccagg agac 

<210> 411 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 411 

cacatccgtg ttgttcacgg atgtggactg tctgtcccaa ggcc 
<210> 412
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 412 

cacatccgtg ttgttcacgg atgtggactg actgtccagg agac 44



<210> 413 
<211> 44 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 413 

cacatccgtg ttgttcacgg atgtggaccc tctgccctgg ggcc 



<210> 414 
<211> 59 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 414 

cctctgactg agtgcacaga gtgctttaac ccaaccggct agtgttagcg gttccccgg 59 



<210> 415 
<211> 69 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 415 

cctctgactg agtgcacaga gtgctttaac ccaaccggct agtgttagcg gttccccggg 60 
acagtcgat 69 



<210> 416 
<211> 69 
<212> DNA 
<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 416 

cctctgactg agtgcacaga gtgctttaac ccaaccggct agtgttagcg gttccccggg 60 
acagacagt 69 

<210> 417 
<211> 69 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 417 

cctctgactg agtgcacaga gtgctttaac ccaaccggct agtgttagcg gttccccggg 60 
acagtcagt 69 



<210> 418 
<211> 70 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 418 

cctctgactg agtgcacaga gtgctttaac ccaaccggct agtgttagcg gtstccccgg 60 
ggcagagggt 70



<210> 419 

<211> 24 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 419 

cctctgactg agtgcacaga gtgc 

<210> 420 
<211> 13 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 
<220>

<221> modified_base 
<222> (5) . . (9) 

<223> A, T, C, G, other or unknown 

<400> 420 
ggccnnnnng gcc 



<210> 421 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4)..(12)

<223> A, T, C, G, other or unknown 

<400> 421 
ccannnnnnn nntgg 



<210> 422 
<211> 12 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4).. (9) 

<223> A, T, C, G, other or unknown 

<400> 422 
cgannnnnnt gc 

<210> 423 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4).. (8) 




<223> A, T, C, G, other or unknown

<400> 423 
gccnnnnngg c 

<210> 424 
<211> 10 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4) . . (7) 

<223> A, T, C, G, other or unknown 

<400> 424 
gatnnnnatc 



<210> 425 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (4)..(8)

<223> A, T, C, G, other or unknown 

<400> 425 
gacnnnnngt c 



<210> 426 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4) . . (8) 

<223> A, T, C, G, other or unknown 

<400> 426 
gcannnnntg c 
<210> 427
<211> 12 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (7)..(12)

<223> A, T, C, G, other or unknown 

<400> 427 
gtatccnnnn nn 

<210> 428 
<211> 12 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4) . . (9) 

<223> A, T, C, G, other or unknown

<400> 428 
gacnnnnnng tc 

<210> 429 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (4)..(8)

<223> A, T, C, G, other or unknown 

<400> 429 
ccannnnntg g 



<210> 430 
<211> 12 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic
oligonucleotide 

<220> 

<221> modified_base 
<222> (1)..(6)

<223> A, T, C, G, other or unknown

<400> 430
nnnnnngaga cg

<210> 431 
<211> 12 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4)..(9)

<223> A, T, C, G, other or unknown 

<400> 431 
ccannnnnnt gg 

<210> 432 
<211> 10 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4) . . (7) 

<223> A, T, C, G, other or unknown 

<400> 432 
gaannnnttc 

<210> 433 
<211> 11 
<212> DNA 

<213> Artificial Sequence 



<220> 
<223> Description of Artificial Sequence: Synthetic
oligonucleotide 

<220> 

<221> modified_base
<222> (7)..(11)

<223> A, T, C, G, other or unknown 

<400> 433 
ggtctcnnnn n 

<210> 434 
<211> 16 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (1)..(10) 

<223> A, T, C, G, other or unknown 

<400> 434 
nnnnnnnnnn ctcctc 



<210> 435 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (1)..(9) 

<223> A, T, C, G, other or unknown 

<400> 435 
nnnnnnnnnt ccgcc 



<210> 436 
<211> 13 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 



<220> 
<221> modified_base
<222> (5)..(9)

<223> A, T, C, G, other or unknown

<400> 436 
ggccnnnnng gcc 

<210> 437 
<211> 12 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4) . . (9) 

<223> A, T, C, G, other or unknown

<400> 437 
ccannnnnnt gg 

<210> 438 

<211> 12 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 

<222> (4) . . (9) 

<223> A, T, C, G, other or unknown 

<400> 438 
gacnnnnnng tc 

<210> 439 
<211> 12 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4) . . (9) 

<223> A, T, C, G, other or unknown 
<400> 439
cgannnnnnt gc 

<210> 440 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (4)..(8)

<223> A, T, C, G, other or unknown 

<400> 440 
gcannnnntg c 



<210> 441 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (4)..(8)

<223> A, T, C, G, other or unknown 

<400> 441 
ccannnnntg g 

<210> 442 
<211> 10 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (4)..(7)

<223> A, T, C, G, other or unknown 

<400> 442 
gaannnnttc 
<210> 443
<211> 12 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (1)..(6)

<223> A, T, C, G, other or unknown 

<400> 443 
nnnnnngaga cg

<210> 444 

<211> 12 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base

<222> (7)..(12)

<223> A, T, C, G, other or unknown 

<400> 444 
gtatccnnnn nn 

<210> 445 

<211> 11 

<212> DNA 

<213> Artificial Sequence

<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (4)..(8)

<223> A, T, C, G, other or unknown

<400> 445 
gacnnnnngt c

<210> 446 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (7)..(11)

<223> A, T, C, G, other or unknown 

<400> 446 
ggtctcnnnn n 

<210> 447 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4) . . (8) 

<223> A, T, C, G, other or unknown 

<400> 447 
gccnnnnngg c 

<210> 448 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (4)..(12)

<223> A, T, C, G, other or unknown 

<400> 448 
ccannnnnnn nntgg 

<210> 449 
<211> 16 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 
<220>

<221> modified_base
<222> (1)..(10)

<223> A, T, C, G, other or unknown

<400> 449 
nnnnnnnnnn ctcctc 



<210> 450 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (1)..(9) 

<223> A, T, C, G, other or unknown 

<400> 450 
nnnnnnnnnt ccgcc 

<210> 451 
<211> 9532 
<212> DNA 

<213> Unknown Organism 
<220> 

<223> Description of Unknown Organism: MALIA3 nucleotide 
sequence 

<220> 
<221> CDS 

<222> (1579) . . (1638) 

<220> 
<221> CDS 

<222> (2343) . . (3443) 

<220> 
<221> CDS 

<222> (3945) . . (4400) 

<220> 
<221> CDS 

<222> (4406) . . (4450) 

<220> 
<221> CDS 

<222> (4746)..(5789)



<400> 451 
aatgctacta ctattagtag aattgatgcc accttttcag ctcgcgcccc aaatgaaaat 60
atagctaaac aggttattga ccatttgcga aatgtatcta atggtcaaac taaatctact 120
cgttcgcaga attgggaatc aactgttaca tggaatgaaa cttccagaca ccgtacttta 180
gttgcatatt taaaacatgt tgagctacag caccagattc agcaattaag ctctaagcca 240
tccgcaaaaa tgacctctta tcaaaaggag caattaaagg tactctctaa tcctgacctg 300
ttggagtttg cttccggtct ggttcgcttt gaagctcgaa ttaaaacgcg atatttgaag 360
tctttcgggc ttcctcttaa tctttttgat gcaatccgct ttgcttctga ctataatagt 420
cagggtaaag acctgatttt tgatttatgg tcattctcgt tttctgaact gtttaaagca 480
tttgaggggg attcaatgaa tatttatgac gattccgcag tattggacgc tatccagtct 540
aaacatttta ctattacccc ctctggcaaa acttcttttg caaaagcctc tcgctatttt 600
ggtttttatc gtcgtctggt aaacgagggt tatgatagtg ttgctcttac tatgcctcgt 660
aattcctttt ggcgttatgt atctgcatta gttgaatgtg gtattcctaa atctcaactg 720
atgaatcttt ctacctgtaa taatgttgtt ccgttagttc gttttattaa cgtagatttt 780
tcttcccaac gtcctgactg gtataatgag ccagttctta aaatcgcata aggtaattca 840
caatgattaa agttgaaatt aaaccatctc aagcccaatt tactactcgt tctggtgttt 900
ctcgtcaggg caagccttat tcactgaatg agcagctttg ttacgttgat ttgggtaatg 960
aatatccggt tcttgtcaag attactcttg atgaaggtca gccagcctat gcgcctggtc 1020
tgtacaccgt tcatctgtcc tctttcaaag ttggtcagtt cggttccctt atgattgacc 1080
gtctgcgcct cgttccggct aagtaacatg gagcaggtcg cggatttcga cacaatttat 1140
caggcgatga tacaaatctc cgttgtactt tgtttcgcgc ttggtataat cgctgggggt 1200
caaagatgag tgttttagtg tattctttcg cctctttcgt tttaggttgg tgccttcgta 1260
gtggcattac gtattttacc cgtttaatgg aaacttcctc atgaaaaagt ctttagtcct 1320
caaagcctct gtagccgttg ctaccctcgt tccgatgctg tctttcgctg ctgagggtga 1380
cgatcccgca aaagcggcct ttaactccct gcaagcctca gcgaccgaat atatcggtta 1440
tgcgtgggcg atggttgttg tcattgtcgg cgcaactatc ggtatcaagc tgtttaagaa 1500
attcacctcg aaagcaagct gataaaccga tacaattaaa ggctcctttt ggagcctttt 1560


tttttggaga ttttcaac gtg aaa aaa tta tta ttc gca att cct tta gtt 1611
Met Lys Lys Leu Leu Phe Ala Ile Pro Leu Val
1 5 10

gtt cct ttc tat tct cac agt gca cag tctgtcgtga cgcagccgcc 1658
Val Pro Phe Tyr Ser His Ser Ala Gln
15 20

ctcagtgtct ggggccccag ggcagagggt caccatctcc tgcactggga gcagctccaa 1718 

catcggggca ggttatgatg tacactggta ccagcagctt ccaggaacag cccccaaact 1778

cctcatctat ggtaacagca atcggccctc aggggtccct gaccgattct ctggctccaa 1838 

gtctggcacc tcagcctccc tggccatcac tgggctccag gctgaggatg aggctgatta 1898

ttactgccag tcctatgaca gcagcctgag tggcctttat gtcttcggaa ctgggaccaa 1958 

ggtcaccgtc ctaggtcagc ccaaggccaa ccccactgtc actctgttcc cgccctcctc 2018 

tgaggagctc caagccaaca aggccacact agtgtgtctg atcagtgact tctacccggg 2078 

agctgtgaca gtggcctgga aggcagatag cagccccgtc aaggcgggag tggagaccac 2138 

cacaccctcc aaacaaagca acaacaagta cgcggccagc agctatctga gcctgacgcc 2198 

tgagcagtgg aagtcccaca gaagctacag ctgccaggtc acgcatgaag ggagcaccgt 2258 

ggagaagaca gtggccccta cagaatgttc ataataaacc gcctccaccg ggcgcgccaa 2318 

ttctatttca aggagacagt cata atg aaa tac cta ttg cct acg gca gcc 2369
Met Lys Tyr Leu Leu Pro Thr Ala Ala
25

gct gga ttg tta tta ctc gcg gcc cag ccg gcc atg gcc gaa gtt caa 2417
Ala Gly Leu Leu Leu Leu Ala Ala Gln Pro Ala Met Ala Glu Val Gln
30 35 40 45

ttg tta gag tct ggt ggc ggt ctt gtt cag cct ggt ggt tct tta cgt 2465
Leu Leu Glu Ser Gly Gly Gly Leu Val Gln Pro Gly Gly Ser Leu Arg
50 55 60

ctt tct tgc gct gct tcc gga ttc act ttc tct tcg tac gct atg tct 2513
Leu Ser Cys Ala Ala Ser Gly Phe Thr Phe Ser Ser Tyr Ala Met Ser
65 70 75

tgg gtt cgc caa gct cct ggt aaa ggt ttg gag tgg gtt tct gct atc 2561
Trp Val Arg Gln Ala Pro Gly Lys Gly Leu Glu Trp Val Ser Ala Ile
80 85 90

tct ggt tct ggt ggc agt act tac tat gct gac tcc gtt aaa ggt cgc 2609
Ser Gly Ser Gly Gly Ser Thr Tyr Tyr Ala Asp Ser Val Lys Gly Arg
95 100 105

ttc act atc tct aga gac aac tct aag aat act ctc tac ttg cag atg 2657
Phe Thr Ile Ser Arg Asp Asn Ser Lys Asn Thr Leu Tyr Leu Gln Met
110 115 120 125

aac agc tta agg gct gag gac act gca gtc tac tat tgc gct aaa gac 2705
Asn Ser Leu Arg Ala Glu Asp Thr Ala Val Tyr Tyr Cys Ala Lys Asp
130 135 140

tat gaa ggt act ggt tat gct ttc gac ata tgg ggt caa ggt act atg 2753
Tyr Glu Gly Thr Gly Tyr Ala Phe Asp Ile Trp Gly Gln Gly Thr Met
145 150 155

gtc acc gtc tct agt gcc tcc acc aag ggc cca tcg gtc ttc ccc ctg 2801
Val Thr Val Ser Ser Ala Ser Thr Lys Gly Pro Ser Val Phe Pro Leu
160 165 170

gca ccc tcc tcc aag agc acc tct ggg ggc aca gcg gcc ctg ggc tgc 2849
Ala Pro Ser Ser Lys Ser Thr Ser Gly Gly Thr Ala Ala Leu Gly Cys
175 180 185

ctg gtc aag gac tac ttc ccc gaa ccg gtg acg gtg tcg tgg aac tca 2897
Leu Val Lys Asp Tyr Phe Pro Glu Pro Val Thr Val Ser Trp Asn Ser
190 195 200 205

ggc gcc ctg acc agc ggc gtc cac acc ttc ccg gct gtc cta cag tct 2945
Gly Ala Leu Thr Ser Gly Val His Thr Phe Pro Ala Val Leu Gln Ser
210 215 220

agc gga ctc tac tcc ctc agc agc gta gtg acc gtg ccc tct tct agc 2993
Ser Gly Leu Tyr Ser Leu Ser Ser Val Val Thr Val Pro Ser Ser Ser
225 230 235

ttg ggc acc cag acc tac atc tgc aac gtg aat cac aag ccc agc aac 3041
Leu Gly Thr Gln Thr Tyr Ile Cys Asn Val Asn His Lys Pro Ser Asn
240 245 250

acc aag gtg gac aag aaa gtt gag ccc aaa tct tgt gcg gcc gct cat 3089
Thr Lys Val Asp Lys Lys Val Glu Pro Lys Ser Cys Ala Ala Ala His
255 260 265

cac cac cat cat cac tct gct gaa caa aaa ctc atc tca gaa gag gat 3137
His His His His His Ser Ala Glu Gln Lys Leu Ile Ser Glu Glu Asp
270 275 280 285

ctg aat ggt gcc gca gat atc aac gat gat cgt atg gct ggc gcc gct 3185
Leu Asn Gly Ala Ala Asp Ile Asn Asp Asp Arg Met Ala Gly Ala Ala
290 295 300

gaa act gtt gaa agt tgt tta gca aaa ccc cat aca gaa aat tca ttt 3233
Glu Thr Val Glu Ser Cys Leu Ala Lys Pro His Thr Glu Asn Ser Phe
305 310 315

act aac gtc tgg aaa gac gac aaa act tta gat cgt tac gct aac tat 3281
Thr Asn Val Trp Lys Asp Asp Lys Thr Leu Asp Arg Tyr Ala Asn Tyr
320 325 330

gag ggt tgt ctg tgg aat gct aca ggc gtt gta gtt tgt act ggt gac 3329
Glu Gly Cys Leu Trp Asn Ala Thr Gly Val Val Val Cys Thr Gly Asp
335 340 345

gaa act cag tgt tac ggt aca tgg gtt cct att ggg ctt gct atc cct 3377
Glu Thr Gln Cys Tyr Gly Thr Trp Val Pro Ile Gly Leu Ala Ile Pro
350 355 360 365

gaa aat gag ggt ggt ggc tct gag ggt ggc ggt tct gag ggt ggc ggt 3425
Glu Asn Glu Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly
370 375 380

tct gag ggt ggc ggt act aaacctcctg agtacggtga tacacctatt 3473
Ser Glu Gly Gly Gly Thr
385



cccgctaatc ctaatccttc tcttgaggag tctcagcctc ttaatacttt catgtttcag 3593
aataataggt tccgaaatag gcagggggca ttaactgttt atacgggcac tgttactcaa 3653
ggcactgacc ccgttaaaac ttattaccag tacactcctg tatcatcaaa agccatgtat 3713
gacgcttact ggaacggtaa attcagagac tgcgctttcc attctggctt taatgaagat 3773
ccattcgttt gtgaatatca aggccaatcg tctgacctgc ctcaacctcc tgtcaatgct 3833
ggcggcggct ctggtggtgg ttctggtggc ggctctgagg gtggtggctc tgagggtggc 3893
ggttctgagg gtggcggctc tgagggaggc ggttccggtg gtggctctgg t tcc ggt 3950
Ser Gly

gat ttt gat tat gaa aag atg gca aac gct aat aag ggg gct atg acc 3998
Asp Phe Asp Tyr Glu Lys Met Ala Asn Ala Asn Lys Gly Ala Met Thr
390 395 400 405

gaa aat gcc gat gaa aac gcg cta cag tct gac gct aaa ggc aaa ctt 4046
Glu Asn Ala Asp Glu Asn Ala Leu Gln Ser Asp Ala Lys Gly Lys Leu
410 415 420

gat tct gtc gct act gat tac ggt gct gct atc gat ggt ttc att ggt 4094
Asp Ser Val Ala Thr Asp Tyr Gly Ala Ala Ile Asp Gly Phe Ile Gly
425 430 435

gac gtt tcc ggc ctt gct aat ggt aat ggt gct act ggt gat ttt gct 4142
Asp Val Ser Gly Leu Ala Asn Gly Asn Gly Ala Thr Gly Asp Phe Ala
440 445 450

ggc tct aat tcc caa atg gct caa gtc ggt gac ggt gat aat tca cct 4190
Gly Ser Asn Ser Gln Met Ala Gln Val Gly Asp Gly Asp Asn Ser Pro
455 460 465

tta atg aat aat ttc cgt caa tat tta cct tcc ctc cct caa tcg gtt 4238
Leu Met Asn Asn Phe Arg Gln Tyr Leu Pro Ser Leu Pro Gln Ser Val
470 475 480 485

gaa tgt cgc cct ttt gtc ttt agc gct ggt aaa cca tat gaa ttt tct 4286
Glu Cys Arg Pro Phe Val Phe Ser Ala Gly Lys Pro Tyr Glu Phe Ser
490 495 500

att gat tgt gac aaa ata aac tta ttc cgt ggt gtc ttt gcg ttt ctt 4334
Ile Asp Cys Asp Lys Ile Asn Leu Phe Arg Gly Val Phe Ala Phe Leu
505 510 515

tta tat gtt gcc acc ttt atg tat gta ttt tct acg ttt gct aac ata 4382
Leu Tyr Val Ala Thr Phe Met Tyr Val Phe Ser Thr Phe Ala Asn Ile
520 525 530

ctg cgt aat aag gag tct taatc atg cca gtt ctt ttg ggt att ccg tta 4432
Leu Arg Asn Lys Glu Ser Met Pro Val Leu Leu Gly Ile Pro Leu
535 540 545

tta ttg cgt ttc ctc ggt ttccttctgg taactttgtt cggctatctg 4480
Leu Leu Arg Phe Leu Gly
550

cttacttttc ttaaaaaggg cttcggtaag atagctattg ctatttcatt gtttcttgct 4540

cttattattg ggcttaactc aattcttgtg ggttatctct ctgatattag cgctcaatta 4600

ccctctgact ttgttcaggg tgttcagtta attctcccgt ctaatgcgct tccctgtttt 4660

tatgttattc tctctgtaaa ggctgctatt ttcatttttg acgttaaaca aaaaatcgtt 4720

tcttatttgg attgggataa ataat atg gct gtt tat ttt gta act ggc aaa 4772
Met Ala Val Tyr Phe Val Thr Gly Lys
555 560

tta ggc tct gga aag acg ctc gtt agc gtt ggt aag att cag gat aaa 4820
Leu Gly Ser Gly Lys Thr Leu Val Ser Val Gly Lys Ile Gln Asp Lys
565 570 575

att gta gct ggg tgc aaa ata gca act aat ctt gat tta agg ctt caa 4868
Ile Val Ala Gly Cys Lys Ile Ala Thr Asn Leu Asp Leu Arg Leu Gln
580 585 590 595

aac ctc ccg caa gtc ggg agg ttc gct aaa acg cct cgc gtt ctt aga 4916
Asn Leu Pro Gln Val Gly Arg Phe Ala Lys Thr Pro Arg Val Leu Arg
600 605 610

ata ccg gat aag cct tct ata tct gat ttg ctt gct att ggg cgc ggt 4964
Ile Pro Asp Lys Pro Ser Ile Ser Asp Leu Leu Ala Ile Gly Arg Gly
615 620 625

aat gat tcc tac gat gaa aat aaa aac ggc ttg ctt gtt ctc gat gag 5012
Asn Asp Ser Tyr Asp Glu Asn Lys Asn Gly Leu Leu Val Leu Asp Glu
630 635 640

tgc ggt act tgg ttt aat acc cgt tct tgg aat gat aag gaa aga cag 5060
Cys Gly Thr Trp Phe Asn Thr Arg Ser Trp Asn Asp Lys Glu Arg Gln
645 650 655

ccg att att gat tgg ttt cta cat gct cgt aaa tta gga tgg gat att 5108
Pro Ile Ile Asp Trp Phe Leu His Ala Arg Lys Leu Gly Trp Asp Ile
660 665 670 675

att ttt ctt gtt cag gac tta tct att gtt gat aaa cag gcg cgt tct 5156
Ile Phe Leu Val Gln Asp Leu Ser Ile Val Asp Lys Gln Ala Arg Ser
680 685 690

gca tta gct gaa cat gtt gtt tat tgt cgt cgt ctg gac aga att act 5204
Ala Leu Ala Glu His Val Val Tyr Cys Arg Arg Leu Asp Arg Ile Thr
695 700 705

tta cct ttt gtc ggt act tta tat tct ctt att act ggc tcg aaa atg 5252
Leu Pro Phe Val Gly Thr Leu Tyr Ser Leu Ile Thr Gly Ser Lys Met
710 715 720
cct ctg cct aaa tta cat gtt ggc gtt gtt aaa tat ggc gat tct caa 5300 
Pro Leu Pro Lys Leu His Val Gly Val Val Lys Tyr Gly Asp Ser Gln 
725 730 735 

tta agc cct act gtt gag cgt tgg ctt tat act ggt aag aat ttg tat 5348 
Leu Ser Pro Thr Val Glu Arg Trp Leu Tyr Thr Gly Lys Asn Leu Tyr 
740 745 750 755 

aac gca tat gat act aaa cag gct ttt tct agt aat tat gat tcc ggt 5396 
Asn Ala Tyr Asp Thr Lys Gln Ala Phe Ser Ser Asn Tyr Asp Ser Gly 
760 765 770 

gtt tat tct tat tta acg cct tat tta tca cac ggt cgg tat ttc aaa 5444 
Val Tyr Ser Tyr Leu Thr Pro Tyr Leu Ser His Gly Arg Tyr Phe Lys 
775 780 785 

cca tta aat tta ggt cag aag atg aaa tta act aaa ata tat ttg aaa 5492 
Pro Leu Asn Leu Gly Gln Lys Met Lys Leu Thr Lys Ile Tyr Leu Lys 
790 795 800 

aag ttt tct cgc gtt ctt tgt ctt gcg att gga ttt gca tca gca ttt 5540 
Lys Phe Ser Arg Val Leu Cys Leu Ala Ile Gly Phe Ala Ser Ala Phe 
805 810 815 

aca tat agt tat ata acc caa cct aag ccg gag gtt aaa aag gta gtc 5588 
Thr Tyr Ser Tyr Ile Thr Gln Pro Lys Pro Glu Val Lys Lys Val Val 
820 825 830 835 

tct cag acc tat gat ttt gat aaa ttc act att gac tct tct cag cgt 5636 
Ser Gln Thr Tyr Asp Phe Asp Lys Phe Thr Ile Asp Ser Ser Gln Arg 
840 845 850 

ctt aat cta agc tat cgc tat gtt ttc aag gat tct aag gga aaa tta 5684 
Leu Asn Leu Ser Tyr Arg Tyr Val Phe Lys Asp Ser Lys Gly Lys Leu 
855 860 865 

att aat agc gac gat tta cag aag caa ggt tat tca ctc aca tat att 5732 
Ile Asn Ser Asp Asp Leu Gln Lys Gln Gly Tyr Ser Leu Thr Tyr Ile 
870 875 880 

gat tta tgt act gtt tcc att aaa aaa ggt aat tca aat gaa att gtt 5780 
Asp Leu Cys Thr Val Ser Ile Lys Lys Gly Asn Ser Asn Glu Ile Val 
885 890 895 

aaa tgt aat taattttgtt ttcttgatgt ttgtttcatc atcttctttt 5829 
Lys Cys Asn 
900 

gctcaggtaa ttgaaatgaa taattcgcct ctgcgcgatt ttgtaacttg gtattcaaag 5889 
caatcaggcg aatccgttat tgtttctccc gatgtaaaag gtactgttac tgtatattca 5949 
tctgacgtta aacctgaaaa tctacgcaat ttctttattt ctgttttacg tgctaataat 6009 
tttgatatgg ttggttcaat tccttccata attcagaagt ataatccaaa caatcaggat 6069 
tatattgatg aattgccatc atctgataat caggaatatg atgataattc cgctccttct 6129 
ggtggtttct ttgttccgca aaatgataat gttactcaaa cttttaaaat taataacgtt 6189 
cgggcaaagg atttaatacg agttgtcgaa ttgtttgtaa agtctaatac ttctaaatcc 6249 
tcaaatgtat tatctattga cggctctaat ctattagttg tttctgcacc taaagatatt 6309 
ttagataacc ttcctcaatt cctttctact gttgatttgc caactgacca gatattgatt 6369 
gagggtttga tatttgaggt tcagcaaggt gatgctttag atttttcatt tgctgctggc 6429 
tctcagcgtg gcactgttgc aggcggtgtt aatactgacc gcctcacctc tgttttatct 6489 
tctgctggtg gttcgttcgg tatttttaat ggcgatgttt tagggctatc agttcgcgca 6549 
ttaaagacta atagccattc aaaaatattg tctgtgccac gtattcttac gctttcaggt 6609 
cagaagggtt ctatctctgt tggccagaat gtccctttta ttactggtcg tgtgactggt 6669 
gaatctgcca atgtaaataa tccatttcag acgattgagc gtcaaaatgt aggtatttcc 6729 
atgagcgttt ttcctgttgc aatggctggc ggtaatattg ttctggatat taccagcaag 6789 
gccgatagtt tgagttcttc tactcaggca agtgatgtta ttactaatca aagaagtatt 6849 
gctacaacgg ttaatttgcg tgatggacag actcttttac tcggtggcct cactgattat 6909 
aaaaacactt ctcaagattc tggcgtaccg ttcctgtcta aaatcccttt aatcggcctc 6969 
ctgtttagct cccgctctga ttccaacgag gaaagcacgt tatacgtgct cgtcaaagca 7029 
accatagtac gcgccctgta gcggcgcatt aagcgcggcg ggtgtggtgg ttacgcgcag 7089 
cgtgaccgct acacttgcca gcgccctagc gcccgctcct ttcgctttct tcccttcctt 7149 
tctcgccacg ttcgccggct ttccccgtca agctctaaat cgggggctcc ctttagggtt 7209 
ccgatttagt gctttacggc acctcgaccc caaaaaactt gatttgggtg atggttcacg 7269 
tagtgggcca tcgccctgat agacggtttt tcgccctttg acgttggagt ccacgttctt 7329 
taatagtgga ctcttgttcc aaactggaac aacactcaac cctatctcgg gctattcttt 7389 
tgatttataa gggattttgc cgatttcgga accaccatca aacaggattt tcgcctgctg 7449 
gggcaaacca gcgtggaccg cttgctgcaa ctctctcagg gccaggcggt gaagggcaat 7509 
cagctgttgc ccgtctcact ggtgaaaaga aaaaccaccc tggatccaag cttgcaggtg 7569 
gcacttttcg gggaaatgtg cgcggaaccc ctatttgttt atttttctaa atacattcaa 7629 
atatgtatcc gctcatgaga caataaccct gataaatgct tcaataatat tgaaaaagga 7689 
agagtatgag tattcaacat ttccgtgtcg cccttattcc cttttttgcg gcattttgcc 7749 
ttcctgtttt tgctcaccca gaaacgctgg tgaaagtaaa agatgctgaa gatcagttgg 7809 
gcgcacgagt gggttacatc gaactggatc tcaacagcgg taagatcctt gagagttttc 7869 
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rx r* c f r* n a a ci a 
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a cga rgagca 


/->4-4-4-4-^-jrirTt- 
CLlLtdddgL 


uc tgcta cgt 


catacactat 
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tc tcagaatg 
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aLuiyy ctya 


/r 2 ^ a ^ ^ 
y LdCUCaCCd 


gticacagaaa 


agcdCCLLdC 


ggatggcatg 


acagtaagag 




a. a Ltd uyi^ay 


4~ f^w r*> ¥* /** /~* t— n 4- a 

LyLtyccdLQ 


acca tgagtg 


4~ a/>a/i4- /ir 

dLodCaCLyC 


ggccaactta 


cttctgacaa 


oiuy 


tya Lcyyay y 


accy aaggag 


ctaaccgctt 


t tttgcacaa 


catgggggat 


catgtaactc 


8169 


yccctydicy 


t igggaaccg 


gagctgaatg 


aagccatacc 


aaacgacgag 


cgtgacacca 


8229 


CyStyCCty c 


dyCadLgCCd 


acaacgttgc 


gcaaactatt 


aactggcgaa 


ctacttactc 


8289 


4- ~5 rt f** 4" r**r* r*t 
LayC LLLLty 


yCaaCdd C La 


at agactgga 


tggaggcgga 


taaagttgca 


ggaccacttc 


8349 


4- y^r /r 4- r+nrtr~* 

uy cyLLLygc 


L.LLLCLyyCL 


y gcuggcu ca 


ttgctga taa 


atctggagcc 


ggtgagcgtg 


8409 


yy l-v ucy v^y y 


LdlL-dLLyud 


n /~f or or /t /~* 

y cat- ty gy y c 


caga tgg uaa 


gccctcccgt 


atcgtagtta 


84 69 


I- LaLauydL 


y y y y ay LCdy 


yCadCldLyy 


a cgaacgaaa 


tagacagatc 


gctgagatag 


8529 


y i-yut LL-aLL 


naff aartpah 
ydL Laoywa t 


4- y™v 4" a a ^ +~ 4— 

LggLddCty l 


cagaccaagti 


4-4--3/-»4-j~.-»4--i+- 

ttactcatat 


atactttaga 


8589 


U Lya LL ladd 


dCLLCat LLL 


taa t t taa a a 


ggatctaggt 


gaagatcctt 


tttgataatc 


8649 


LLdiyaLLda 


aatCCCLLaa 


cgtgagtttt 


cgttccactg 


tacgtaagac 


ccccaagctt 


8709 


y Ltyat uydd 


Lyytydd Ly g 


cgctttgcct 


ggtttccggc 


accagaagcg 


gtgccggaaa 


8769 


y i, i»yyL uyya 


yt.ycgat.Ccu 


cc cgaggccg 


atactgtcgt 


cgtcccctca 


aactggcaga 


O O *> ft 

8829 


ty tauyy l. ua 


v»yd tgcyocc 


a tctacacca 


acgtaaccta 


tcccattacg 


gtcaatccgc 


8889 


Uy LLLy LLLL 


CdCygdyddL 


ccgacgggt t 


gttactcgct 


cacatttaat 


gttgatgaaa 


8949 


or*trrrrfr*t"j»r»;a 
yuLyyuLdUd 


yyddyyCCdy 


dCyCyddCLd 


I, l uc cgaugg 


cgttcctatt 


ggttaaaaaa 


ft Aft fi 

9009 


Ly ayu Lya L L 


f aaoaaaaaf 
LddtdddddL 


LtddCyCydd 


ttttaacaaa 


atattaacgt 


ttacaattta 


9069 


<Ja Ld L L LyLU 


Laid Cd alCt 


4> ^ ^ 4- ^4- 4-4-4-4- 

tcct g L LLLL 


ggggcttttc 


tgattatcaa 


ccggggtaca 


9129 


tatgattgac atgctagttt tacgattacc gttcatcgat tctcttgttt gctccagact 9189 
ctcaggcaat gacctgatag cctttgtaga tctctcaaaa atagctaccc tctccggcat 9249 
gaatttatca gctagaacgg ttgaatatca tattgatggt gatttgactg tctccggcct 9309 
ttctcaccct tttgaatctt tacctacaca ttactcaggc attgcattta aaatatatga 9369 
gggttctaaa aatttttatc cttgcgttga aataaaggct tctcccgcaa aagtattaca 9429 
gggtcataat gtttttggta caaccgattt agctttatgc tctgaggctt tattgcttaa 9489 
ttttgctaat tctttgcctt gcctgtatga tttattggat gtt 9532 

<210> 452 
<211> 20 
<212> PRT 

<213> Unknown Organism 
<220> 

<223> Description of Unknown Organism: MALIA3 peptide 
sequence 

<400> 452 

Met Lys Lys Leu Leu Phe Ala Ile Pro Leu Val Val Pro Phe Tyr Ser 
1 5 10 15 

His Ser Ala Gln 
20 



<210> 453 
<211> 367 
<212> PRT 

<213> Unknown Organism 
<220> 

<223> Description of Unknown Organism: MALIA3 protein 
sequence 

<400> 453 

Met Lys Tyr Leu Leu Pro Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala 
1 5 10 15 

Ala Gln Pro Ala Met Ala Glu Val Gln Leu Leu Glu Ser Gly Gly Gly 
20 25 30 

Leu Val Gln Pro Gly Gly Ser Leu Arg Leu Ser Cys Ala Ala Ser Gly 
35 40 45 

Phe Thr Phe Ser Ser Tyr Ala Met Ser Trp Val Arg Gln Ala Pro Gly 
50 55 60 

Lys Gly Leu Glu Trp Val Ser Ala Ile Ser Gly Ser Gly Gly Ser Thr 
65 70 75 80 

Tyr Tyr Ala Asp Ser Val Lys Gly Arg Phe Thr Ile Ser Arg Asp Asn 
85 90 95 

Ser Lys Asn Thr Leu Tyr Leu Gln Met Asn Ser Leu Arg Ala Glu Asp 
100 105 110 

Thr Ala Val Tyr Tyr Cys Ala Lys Asp Tyr Glu Gly Thr Gly Tyr Ala 
115 120 125 

Phe Asp Ile Trp Gly Gln Gly Thr Met Val Thr Val Ser Ser Ala Ser 
130 135 140 

Thr Lys Gly Pro Ser Val Phe Pro Leu Ala Pro Ser Ser Lys Ser Thr 
145 150 155 160 

Ser Gly Gly Thr Ala Ala Leu Gly Cys Leu Val Lys Asp Tyr Phe Pro 
165 170 175 

Glu Pro Val Thr Val Ser Trp Asn Ser Gly Ala Leu Thr Ser Gly Val 
180 185 190 

His Thr Phe Pro Ala Val Leu Gln Ser Ser Gly Leu Tyr Ser Leu Ser 
195 200 205 

Ser Val Val Thr Val Pro Ser Ser Ser Leu Gly Thr Gln Thr Tyr Ile 
210 215 220 

Cys Asn Val Asn His Lys Pro Ser Asn Thr Lys Val Asp Lys Lys Val 
225 230 235 240 

Glu Pro Lys Ser Cys Ala Ala Ala His His His His His His Ser Ala 
245 250 255 

Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Asn Gly Ala Ala Asp Ile 
260 265 270 

Asn Asp Asp Arg Met Ala Gly Ala Ala Glu Thr Val Glu Ser Cys Leu 
275 280 285 

Ala Lys Pro His Thr Glu Asn Ser Phe Thr Asn Val Trp Lys Asp Asp 
290 295 300 

Lys Thr Leu Asp Arg Tyr Ala Asn Tyr Glu Gly Cys Leu Trp Asn Ala 
305 310 315 320 

Thr Gly Val Val Val Cys Thr Gly Asp Glu Thr Gln Cys Tyr Gly Thr 
325 330 335 

Trp Val Pro Ile Gly Leu Ala Ile Pro Glu Asn Glu Gly Gly Gly Ser 
340 345 350 

Glu Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly Thr 
355 360 365 



<210> 454 
<211> 152 
<212> PRT 

<213> Unknown Organism 
<220> 

<223> Description of Unknown Organism: MALIA3 protein 
sequence 

<400> 454 

Ser Gly Asp Phe Asp Tyr Glu Lys Met Ala Asn Ala Asn Lys Gly Ala 
1 5 10 15 

Met Thr Glu Asn Ala Asp Glu Asn Ala Leu Gln Ser Asp Ala Lys Gly 
20 25 30 

Lys Leu Asp Ser Val Ala Thr Asp Tyr Gly Ala Ala Ile Asp Gly Phe 
35 40 45 

Ile Gly Asp Val Ser Gly Leu Ala Asn Gly Asn Gly Ala Thr Gly Asp 
50 55 60 

Phe Ala Gly Ser Asn Ser Gln Met Ala Gln Val Gly Asp Gly Asp Asn 
65 70 75 80 

Ser Pro Leu Met Asn Asn Phe Arg Gln Tyr Leu Pro Ser Leu Pro Gln 
85 90 95 

Ser Val Glu Cys Arg Pro Phe Val Phe Ser Ala Gly Lys Pro Tyr Glu 
100 105 110 

Phe Ser Ile Asp Cys Asp Lys Ile Asn Leu Phe Arg Gly Val Phe Ala 
115 120 125 

Phe Leu Leu Tyr Val Ala Thr Phe Met Tyr Val Phe Ser Thr Phe Ala 
130 135 140 

Asn Ile Leu Arg Asn Lys Glu Ser 
145 150 



<210> 455 
<211> 15 
<212> PRT 

<213> Unknown Organism 
<220> 

<223> Description of Unknown Organism: MALIA3 peptide 
sequence 

<400> 455 

Met Pro Val Leu Leu Gly Ile Pro Leu Leu Leu Arg Phe Leu Gly 
1 5 10 15 



<210> 456 
<211> 348 
<212> PRT 

<213> Unknown Organism 
<220> 

<223> Description of Unknown Organism: MALIA3 protein 
sequence 

<400> 456 

Met Ala Val Tyr Phe Val Thr Gly Lys Leu Gly Ser Gly Lys Thr Leu 
1 5 10 15 

Val Ser Val Gly Lys Ile Gln Asp Lys Ile Val Ala Gly Cys Lys Ile 
20 25 30 

Ala Thr Asn Leu Asp Leu Arg Leu Gln Asn Leu Pro Gln Val Gly Arg 
35 40 45 

Phe Ala Lys Thr Pro Arg Val Leu Arg Ile Pro Asp Lys Pro Ser Ile 
50 55 60 

Ser Asp Leu Leu Ala Ile Gly Arg Gly Asn Asp Ser Tyr Asp Glu Asn 
65 70 75 80 

Lys Asn Gly Leu Leu Val Leu Asp Glu Cys Gly Thr Trp Phe Asn Thr 
85 90 95 

Arg Ser Trp Asn Asp Lys Glu Arg Gln Pro Ile Ile Asp Trp Phe Leu 
100 105 110 

His Ala Arg Lys Leu Gly Trp Asp Ile Ile Phe Leu Val Gln Asp Leu 
115 120 125 

Ser Ile Val Asp Lys Gln Ala Arg Ser Ala Leu Ala Glu His Val Val 
130 135 140 

Tyr Cys Arg Arg Leu Asp Arg Ile Thr Leu Pro Phe Val Gly Thr Leu 
145 150 155 160 

Tyr Ser Leu Ile Thr Gly Ser Lys Met Pro Leu Pro Lys Leu His Val 
165 170 175 

Gly Val Val Lys Tyr Gly Asp Ser Gln Leu Ser Pro Thr Val Glu Arg 
180 185 190 

Trp Leu Tyr Thr Gly Lys Asn Leu Tyr Asn Ala Tyr Asp Thr Lys Gln 
195 200 205 

Ala Phe Ser Ser Asn Tyr Asp Ser Gly Val Tyr Ser Tyr Leu Thr Pro 
210 215 220 

Tyr Leu Ser His Gly Arg Tyr Phe Lys Pro Leu Asn Leu Gly Gln Lys 
225 230 235 240 

Met Lys Leu Thr Lys Ile Tyr Leu Lys Lys Phe Ser Arg Val Leu Cys 
245 250 255 

Leu Ala Ile Gly Phe Ala Ser Ala Phe Thr Tyr Ser Tyr Ile Thr Gln 
260 265 270 

Pro Lys Pro Glu Val Lys Lys Val Val Ser Gln Thr Tyr Asp Phe Asp 
275 280 285 

Lys Phe Thr Ile Asp Ser Ser Gln Arg Leu Asn Leu Ser Tyr Arg Tyr 
290 295 300 

Val Phe Lys Asp Ser Lys Gly Lys Leu Ile Asn Ser Asp Asp Leu Gln 
305 310 315 320 

Lys Gln Gly Tyr Ser Leu Thr Tyr Ile Asp Leu Cys Thr Val Ser Ile 
325 330 335 

Lys Lys Gly Asn Ser Asn Glu Ile Val Lys Cys Asn 
340 345 



<210> 457 
<211> 24 
<212> DNA 
<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 457 

tggaagaggc acgttctttt cttt 



<210> 458 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 458 

cttttctttg ttgccgttgg ggtg 



<210> 459 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 459 

acactctccc ctgttgaagc tctt 



<210> 460 
<211> 51 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 460 

accgcctcca ccgggcgcgc cttattaaca ctctcccctg ttgaagctct 



<210> 461 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 461 

tgaacattct gtaggggcca ctg 



<210> 462 
<211> 23 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: Primer 
<400> 462 

agagcattct gcaggggcca ctg 

<210> 463 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 463 

accgcctcca ccgggcgcgc cttattatga acattctgta ggggccactg 



<210> 464 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 464 

accgcctcca ccgggcgcgc cttattaaga gcattctgca ggggccactg 

<210> 465 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 465 

cgactggagc acgaggacac tga 

<210> 466 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 466 

ggacactgac atggactgaa ggagta 
<210> 467 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 467 

gggaggatgg agactgggtc 



<210> 468 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 468 

gggaagatgg agactgggtc 



<210> 469 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 469 

gggagagtgg agactgagtc 



<210> 470 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 470 

gggtgcctgg agactgcgtc 



<210> 471 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 471 

gggtggctgg agactgcgtc 

<210> 472 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 472 

gggaggatgg agactgggtc atctggatgt cttgtgcact gtgacagagg 



<210> 473 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 473 

gggaagatgg agactgggtc atctggatgt cttgtgcact gtgacagagg 



<210> 474 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 474 

gggagagtgg agactgggtc atctggatgt cttgtgcact gtgacagagg 



<210> 475 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 475 

gggtgcctgg agactgggtc atctggatgt cttgtgcact gtgacagagg 
<210> 476 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 476 

gggtggctgg agactgggtc atctggatgt cttgtgcact gtgacagagg 



<210> 477 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 477 

gggagtctgg agactgggtc atctggatgt cttgtgcact gtgacagagg 



<210> 478 
<211> 42 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 478 

cctctgtcac agtgcacaag acatccagat gacccagtct cc 



<210> 479 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 479 

cctctgtcac agtgcacaag ac 22 



<210> 480 
<211> 24 
<212> DNA 
<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 480 

***** . ~ 4 *- 



<210> 481 
<211> 668 
<212> DNA 
<213> Homo sapiens 

<220> 

<221> CDS 

<222> (1)..(668) 

<400> 481 

agt gca caa gac atc cag atg acc cag tct cca gcc acc ctg tct gtg 48 
Ser Ala Gln Asp Ile Gln Met Thr Gln Ser Pro Ala Thr Leu Ser Val 
1 5 10 15 

tct cca ggg gaa agg gcc acc ctc tcc tgc agg gcc agt cag agt gtt 96 
Ser Pro Gly Glu Arg Ala Thr Leu Ser Cys Arg Ala Ser Gln Ser Val 
20 25 30 

agt aac aac tta gcc tgg tac cag cag aaa cct ggc cag gtt ccc agg 144 
Ser Asn Asn Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln Val Pro Arg 
35 40 45 

ctc ctc atc tat ggt gca tcc acc agg gcc act gat atc cca gcc agg 192 
Leu Leu Ile Tyr Gly Ala Ser Thr Arg Ala Thr Asp Ile Pro Ala Arg 
50 55 60 

ttc agt ggc agt ggg tct ggg aca gac ttc act ctc acc atc agc aga 240 
Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Arg 
65 70 75 80 

ctg gag cct gaa gat ttt gca gtg tat tac tgt cag cgg tat ggt agc 288 
Leu Glu Pro Glu Asp Phe Ala Val Tyr Tyr Cys Gln Arg Tyr Gly Ser 
85 90 95 

tca ccg ggg tgg acg ttc ggc caa ggg acc aag gtg gaa atc aaa cga 336 
Ser Pro Gly Trp Thr Phe Gly Gln Gly Thr Lys Val Glu Ile Lys Arg 
100 105 110 

act gtg gct gca cca tct gtc ttc atc ttc ccg cca tct gat gag cag 384 
Thr Val Ala Ala Pro Ser Val Phe Ile Phe Pro Pro Ser Asp Glu Gln 
115 120 125 

ttg aaa tct gga act gcc tct gtt gtg tgc ctg ctg aat aac ttc tat 432 
Leu Lys Ser Gly Thr Ala Ser Val Val Cys Leu Leu Asn Asn Phe Tyr 
130 135 140 

ccc aga gag gcc aaa gta cag tgg aag gtg gat aac gcc ctc caa tcg 480 
Pro Arg Glu Ala Lys Val Gln Trp Lys Val Asp Asn Ala Leu Gln Ser 
145 150 155 160 

ggt aac tcc cag gag agt gtc aca gag cag gac agc aag gac agc acc 528 
Gly Asn Ser Gln Glu Ser Val Thr Glu Gln Asp Ser Lys Asp Ser Thr 
165 170 175 

tac agc ctc agc agc acc ctg acg ctg agc aaa gca gac tac gag aaa 576 
Tyr Ser Leu Ser Ser Thr Leu Thr Leu Ser Lys Ala Asp Tyr Glu Lys 
180 185 190 

cac aaa gtc tac gcc tgc gaa gtc acc cat cag ggc ctg agc tcg cct 624 
His Lys Val Tyr Ala Cys Glu Val Thr His Gln Gly Leu Ser Ser Pro 
195 200 205 

gtc aca aag agc ttc aac aaa gga gag tgt aag ggc gaa ttc gc 668 
Val Thr Lys Ser Phe Asn Lys Gly Glu Cys Lys Gly Glu Phe Ala 
210 215 220 



<210> 482 
<211> 223 
<212> PRT 
<213> Homo sapiens 

<400> 482 

Ser Ala Gln Asp Ile Gln Met Thr Gln Ser Pro Ala Thr Leu Ser Val 
1 5 10 15 

Ser Pro Gly Glu Arg Ala Thr Leu Ser Cys Arg Ala Ser Gln Ser Val 
20 25 30 

Ser Asn Asn Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln Val Pro Arg 
35 40 45 

Leu Leu Ile Tyr Gly Ala Ser Thr Arg Ala Thr Asp Ile Pro Ala Arg 
50 55 60 

Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Arg 
65 70 75 80 

Leu Glu Pro Glu Asp Phe Ala Val Tyr Tyr Cys Gln Arg Tyr Gly Ser 
85 90 95 

Ser Pro Gly Trp Thr Phe Gly Gln Gly Thr Lys Val Glu Ile Lys Arg 
100 105 110 

Thr Val Ala Ala Pro Ser Val Phe Ile Phe Pro Pro Ser Asp Glu Gln 
115 120 125 

Leu Lys Ser Gly Thr Ala Ser Val Val Cys Leu Leu Asn Asn Phe Tyr 
130 135 140 

Pro Arg Glu Ala Lys Val Gln Trp Lys Val Asp Asn Ala Leu Gln Ser 
145 150 155 160 

Gly Asn Ser Gln Glu Ser Val Thr Glu Gln Asp Ser Lys Asp Ser Thr 
165 170 175 

Tyr Ser Leu Ser Ser Thr Leu Thr Leu Ser Lys Ala Asp Tyr Glu Lys 
180 185 190 

His Lys Val Tyr Ala Cys Glu Val Thr His Gln Gly Leu Ser Ser Pro 
195 200 205 

Val Thr Lys Ser Phe Asn Lys Gly Glu Cys Lys Gly Glu Phe Ala 
210 215 220 



<210> 483 
<211> 13 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 483 
agccaccctg tct 



<210> 484 
<211> 700 
<212> DNA 
<213> Homo sapiens 

<220> 

<221> CDS 

<222> (1)..(699) 

<400> 484 

agt gca caa gac atc cag atg acc cag tct cct gcc acc ctg tct gtg 48 
Ser Ala Gln Asp Ile Gln Met Thr Gln Ser Pro Ala Thr Leu Ser Val 
1 5 10 15 

tct cca ggt gaa aga gcc acc ctc tcc tgc agg gcc agt cag gtg tct 96 
Ser Pro Gly Glu Arg Ala Thr Leu Ser Cys Arg Ala Ser Gln Val Ser 
20 25 30 

cca ggg gaa aga gcc acc ctc tcc tgc aat ctt ctc agc aac tta gcc 144 
Pro Gly Glu Arg Ala Thr Leu Ser Cys Asn Leu Leu Ser Asn Leu Ala 
35 40 45 

tgg tac cag cag aaa cct ggc cag gct ccc agg ctc ctc atc tat ggt 192 
Trp Tyr Gln Gln Lys Pro Gly Gln Ala Pro Arg Leu Leu Ile Tyr Gly 
50 55 60 

gct tcc acc ggg gcc att ggt atc cca gcc agg ttc agt ggc agt ggg 240 
Ala Ser Thr Gly Ala Ile Gly Ile Pro Ala Arg Phe Ser Gly Ser Gly 
65 70 75 80 

tct ggg aca gag ttc act ctc acc atc agc agc ctg cag tct gaa gat 288 
Ser Gly Thr Glu Phe Thr Leu Thr Ile Ser Ser Leu Gln Ser Glu Asp 
85 90 95 

ttt gca gtg tat ttc tgt cag cag tat ggt acc tca ccg ccc act ttc 336 
Phe Ala Val Tyr Phe Cys Gln Gln Tyr Gly Thr Ser Pro Pro Thr Phe 
100 105 110 

ggc gga ggg acc aag gtg gag atc aaa cga act gtg gct gca cca tct 384 
Gly Gly Gly Thr Lys Val Glu Ile Lys Arg Thr Val Ala Ala Pro Ser 
115 120 125 

gtc ttc atc ttc ccg cca tct gat gag cag ttg aaa tct gga act gcc 432 
Val Phe Ile Phe Pro Pro Ser Asp Glu Gln Leu Lys Ser Gly Thr Ala 
130 135 140 

tct gtt gtg tgc ccg ctg aat aac ttc tat ccc aga gag gcc aaa gta 480 
Ser Val Val Cys Pro Leu Asn Asn Phe Tyr Pro Arg Glu Ala Lys Val 
145 150 155 160 

cag tgg aag gtg gat aac gcc ctc caa tcg ggt aac tcc cag gag agt 528 
Gln Trp Lys Val Asp Asn Ala Leu Gln Ser Gly Asn Ser Gln Glu Ser 
165 170 175 

gtc aca gag cag gac aac aag gac agc acc tac agc ctc agc agc acc 576 
Val Thr Glu Gln Asp Asn Lys Asp Ser Thr Tyr Ser Leu Ser Ser Thr 
180 185 190 

ctg acg ctg agc aaa gta gac tac gag aaa cac gaa gtc tac gcc tgc 624 
Leu Thr Leu Ser Lys Val Asp Tyr Glu Lys His Glu Val Tyr Ala Cys 
195 200 205 

gaa gtc acc cat cag ggc ctt agc tcg ccc gtc acg aag agc ttc aac 672 
Glu Val Thr His Gln Gly Leu Ser Ser Pro Val Thr Lys Ser Phe Asn 
210 215 220 

agg gga gag tgt aag aaa gaa ttc gtt t 700 
Arg Gly Glu Cys Lys Lys Glu Phe Val 
225 230 



<210> 485 
<211> 233 
<212> PRT 

<213> Homo sapiens 
<400> 485 

Ser Ala Gln Asp Ile Gln Met Thr Gln Ser Pro Ala Thr Leu Ser Val 
1 5 10 15 

Ser Pro Gly Glu Arg Ala Thr Leu Ser Cys Arg Ala Ser Gln Val Ser 
20 25 30 

Pro Gly Glu Arg Ala Thr Leu Ser Cys Asn Leu Leu Ser Asn Leu Ala 
35 40 45 

Trp Tyr Gln Gln Lys Pro Gly Gln Ala Pro Arg Leu Leu Ile Tyr Gly 
50 55 60 

Ala Ser Thr Gly Ala Ile Gly Ile Pro Ala Arg Phe Ser Gly Ser Gly 
65 70 75 80 

Ser Gly Thr Glu Phe Thr Leu Thr Ile Ser Ser Leu Gln Ser Glu Asp 
85 90 95 

Phe Ala Val Tyr Phe Cys Gln Gln Tyr Gly Thr Ser Pro Pro Thr Phe 
100 105 110 

Gly Gly Gly Thr Lys Val Glu Ile Lys Arg Thr Val Ala Ala Pro Ser 
115 120 125 

Val Phe Ile Phe Pro Pro Ser Asp Glu Gln Leu Lys Ser Gly Thr Ala 
130 135 140 

Ser Val Val Cys Pro Leu Asn Asn Phe Tyr Pro Arg Glu Ala Lys Val 
145 150 155 160 

Gln Trp Lys Val Asp Asn Ala Leu Gln Ser Gly Asn Ser Gln Glu Ser 
165 170 175 

Val Thr Glu Gln Asp Asn Lys Asp Ser Thr Tyr Ser Leu Ser Ser Thr 
180 185 190 

Leu Thr Leu Ser Lys Val Asp Tyr Glu Lys His Glu Val Tyr Ala Cys 
195 200 205 

Glu Val Thr His Gln Gly Leu Ser Ser Pro Val Thr Lys Ser Phe Asn 
210 215 220 

Arg Gly Glu Cys Lys Lys Glu Phe Val 
225 230 



<210> 486 
<211> 419 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 3-23 
VH nucleotide sequence 

<220> 

<221> CDS 

<222> (12)..(419) 

<400> 486 

ctgtctgaac g gcc cag ccg gcc atg gcc gaa gtt caa ttg tta gag tct 50 
Ala Gln Pro Ala Met Ala Glu Val Gln Leu Leu Glu Ser 
1 5 10 

ggt ggc ggt ctt gtt cag cct ggt ggt tct tta cgt ctt tct tgc gct 98 
Gly Gly Gly Leu Val Gln Pro Gly Gly Ser Leu Arg Leu Ser Cys Ala 
15 20 25 

gct tcc gga ttc act ttc tct tcg tac gct atg tct tgg gtt cgc caa 146 
Ala Ser Gly Phe Thr Phe Ser Ser Tyr Ala Met Ser Trp Val Arg Gln 
30 35 40 45 

gct cct ggt aaa ggt ttg gag tgg gtt tct gct atc tct ggt tct ggt 194 
Ala Pro Gly Lys Gly Leu Glu Trp Val Ser Ala Ile Ser Gly Ser Gly 
50 55 60 

ggc agt act tac tat gct gac tcc gtt aaa ggt cgc ttc act atc tct 242 
Gly Ser Thr Tyr Tyr Ala Asp Ser Val Lys Gly Arg Phe Thr Ile Ser 
65 70 75 

aga gac aac tct aag aat act ctc tac ttg cag atg aac agc tta agg 290 
Arg Asp Asn Ser Lys Asn Thr Leu Tyr Leu Gln Met Asn Ser Leu Arg 
80 85 90 

gct gag gac act gca gtc tac tat tgc gct aaa gac tat gaa ggt act 338 
Ala Glu Asp Thr Ala Val Tyr Tyr Cys Ala Lys Asp Tyr Glu Gly Thr 
95 100 105 

ggt tat gct ttc gac ata tgg ggt caa ggt act atg gtc acc gtc tct 386 
Gly Tyr Ala Phe Asp Ile Trp Gly Gln Gly Thr Met Val Thr Val Ser 
110 115 120 125 

agt gcc tcc acc aag ggc cca tcg gtc ttc ccc 419 
Ser Ala Ser Thr Lys Gly Pro Ser Val Phe Pro 
130 135 



<210> 487 
<211> 136 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 3-23 
VH protein sequence 

<400> 487 

Ala Gln Pro Ala Met Ala Glu Val Gln Leu Leu Glu Ser Gly Gly Gly 
1 5 10 15 

Leu Val Gln Pro Gly Gly Ser Leu Arg Leu Ser Cys Ala Ala Ser Gly 
20 25 30 

Phe Thr Phe Ser Ser Tyr Ala Met Ser Trp Val Arg Gln Ala Pro Gly 
35 40 45 

Lys Gly Leu Glu Trp Val Ser Ala Ile Ser Gly Ser Gly Gly Ser Thr 
50 55 60 

Tyr Tyr Ala Asp Ser Val Lys Gly Arg Phe Thr Ile Ser Arg Asp Asn 
65 70 75 80 

Ser Lys Asn Thr Leu Tyr Leu Gln Met Asn Ser Leu Arg Ala Glu Asp 
85 90 95 

Thr Ala Val Tyr Tyr Cys Ala Lys Asp Tyr Glu Gly Thr Gly Tyr Ala 
100 105 110 

Phe Asp Ile Trp Gly Gln Gly Thr Met Val Thr Val Ser Ser Ala Ser 
115 120 125 

Thr Lys Gly Pro Ser Val Phe Pro 
130 135 
<210> 488 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 488 

ctgtctgaac ggcccagccg 20 

<210> 489 
<211> 83 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 489 

ctgtctgaac ggcccagccg gccatggccg aagttcaatt gttagagtct ggtggcggtc 60 
ttgttcagcc tggtggttct tta 83 

<210> 490 
<211> 54 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 490 

gaaagtgaat ccggaagcag cgcaagaaag acgtaaagaa ccaccaggct gaac 54 

<210> 491 
<211> 42 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 491 

agaaacccac tccaaacctt taccaggagc ttggcgaacc ca 42 

<210> 492 
<211> 94 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 492 
agtgtcctca gcccttaagc tgttcatctg caagtagaga gtattcttag agttgtctct 60 
agagatagtg aagcgacctt taacggagtc agca 94 

<210> 493 
<211> 81 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 493 

gcttaagggc tgaggacact gcagtctact attgcgctaa agactatgaa ggtactggtt 60 
atgctttcga catatggggt c 81 



<210> 494 
<211> 72 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 494 

ggggaagacc gatgggccct tggtggaggc actagagacg gtgaccatag taccttgacc 60 
tatgtcgaaa gc 72 



<210> 495 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 495 

ggggaagacc gatgggccct tgg 23 

<210> 496 
<211> 56 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 
<220> 

<221> modified_base 
<222> (22)..(24) 

<223> A, T, C, G, other or unknown 
<220> 

<221> modified_base 
<222> (28)..(30) 

<223> A, T, C, G, other or unknown 
<220> 

<221> modified_base 
<222> (34)..(36) 

<223> A, T, C, G, other or unknown 
<220> 

<223> nnn codes for any amino acid but Cys 
<400> 496 

gcttccggat tcactttctc tnnntacnnn atgnnntggg ttcgccaagc tcctgg 56 



<210> 497 
<211> 68 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (19)..(21) 
<223> A, T, C or G 

<220> 

<221> modified_base 
<222> (25)..(30) 
<223> A, T, C or G 

<220> 

<221> modified_base 
<222> (40)..(42) 
<223> A, T, C or G 

<220> 

<221> modified_base 
<222> (46)..(48) 
<223> A, T, C or G 

<400> 497 

ggtttggagt gggtttctnn natcnnnnnn tctggtggcn nnactnnnta tgctgactcc 60 
gttaaagg 68 



<210> 498 



WO 02/083872 



PCT/US02/12405 



129 

<211> 912 
<212> DNA 

<213> Escherichia coli 
<400> 498 

tccggagctt cagatctgtt tgcctttttg tggggtggtg cagatcgcgt tacggagatc 60 
gaccgactgc ttgagcaaaa gccacgctta actgctgatc aggcatggga tgttattcgc 120 
caaaccagtc gtcaggatct taacctgagg ctttttttac ctactctgca agcagcgaca 180 
tctggtttga cacagagcga tccgcgtcgt cagttggtag aaacattaac acgttgggat 240 
ggcatcaatt tgcttaatga tgatggtaaa acctggcagc agccaggctc tgccatcctg 300 
aacgtttggc tgaccagtat gttgaagcgt accgtagtgg ctgccgtacc tatgccattt 360 
gataagtggt acagcgccag tggctacgaa acaacccagg acggcccaac tggttcgctg 420 
aatataagtg ttggagcaaa aattttgtat gaggcggtgc agggagacaa atcaccaatc 480 
ccacaggcgg ttgatctgtt tgctgggaaa ccacagcagg aggttgtgtt ggctgcgctg 540 
gaagatacct gggagactct ttccaaacgc tatggcaata atgtgagtaa ctggaaaaca 600 
cctgcaatgg ccttaacgtt ccgggcaaat aatttctttg gtgtaccgca ggccgcagcg 660 
gaagaaacgc gtcatcaggc ggagtatcaa aaccgtggaa cagaaaacga tatgattgtt 720 
ttctcaccaa cgacaagcga tcgtcctgtg cttgcctggg atgtggtcgc acccggtcag 780 
agtgggttta ttgctcccga tggaacagtt gataagcact atgaagatca gctgaaaatg 840 
tacgaaaatt ttggccgtaa gtcgctctgg ttaacgaagc aggatgtgga ggcgcataag 900 
gagtcgtcta ga 912 



<210> 499 
<211> 10 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4)..(7) 

<223> A, T, C, G, other or unknown 
<400> 499 

gatnnnnatc 10 



<210> 500 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (1)..(15) 

<223> A, T, C, G, other or unknown 



<400> 500 

nnnnnnnnnn nnnnngtccc 20 
<210> 501 

<211> 11 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 

<222> (4)..(8) 

<223> A, T, C, G, other or unknown 

<400> 501 
gcannnnntg c 

<210> 502 

<211> 10 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 

<222> (4)..(7) 

<223> A, T, C, G, other or unknown 

<400> 502 
gacnnnngtc 

<210> 503 
<211> 12 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (1)..(7) 

<223> A, T, C, G, other or unknown 

<400> 503 
nnnnnnngcg gg 

<210> 504 
<211> 12 
<212> DNA 
<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (7)..(12) 

<223> A, T, C, G, other or unknown 

<400> 504 
gtatccnnnn nn 

<210> 505 

<211> 12 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 

<222> (4)..(9) 

<223> A, T, C, G, other or unknown 

<400> 505 
gcannnnnnt cg 

<210> 506 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4)..(8) 

<223> A, T, C, G, other or unknown 

<400> 506 
gccnnnnngg c 

<210> 507 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (7)..(11) 

<223> A, T, C, G, other or unknown 

<400> 507 
ggtctcnnnn n 



<210> 508 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4)..(11) 

<223> A, T, C, G, other or unknown 

<400> 508 
gacnnnnngt c 



<210> 509 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4)..(8) 

<223> A, T, C, G, other or unknown 

<400> 509 
gacnnnnngt c 



<210> 510 
<211> 12 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4)..(9) 

<223> A, T, C, G, other or unknown 

<400> 510 
gacnnnnnng tc 

<210> 511 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4)..(8) 

<223> A, T, C, G, other or unknown 

<400> 511 
ccannnnntg g 

<210> 512 

<211> 15 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 

<222> (1)..(9) 

<223> A, T, C, G, other or unknown 

<400> 512 
nnnnnnnnng caggt 

<210> 513 

<211> 11 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 

<222> (7)..(11) 

<223> A, T, C, G, other or unknown 

<400> 513 
acctgcnnnn n 



<210> 514 
<211> 13 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (5)..(9) 

<223> A, T, C, G, other or unknown 

<400> 514 
ggccnnnnng gcc 

<210> 515 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4)..(12) 

<223> A, T, C, G, other or unknown 

<400> 515 
ccannnnnnn nntgg 

<210> 516 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (7)..(11) 

<223> A, T, C, G, other or unknown 

<400> 516 
cgtctcnnnn n 



<210> 517 
<211> 12 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (1)..(6) 

<223> A, T, C, G, other or unknown 

<400> 517 
nnnnnngaga cg 

<210> 518 
<211> 16 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (1)..(10) 

<223> A, T, C, G, other or unknown 

<400> 518 
nnnnnnnnnn ctcctc 



<210> 519 
<211> 16 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (7)..(16) 

<223> A, T, C, G, other or unknown 

<400> 519 
gaggagnnnn nnnnnn 

<210> 520 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4)..(8) 

<223> A, T, C, G, other or unknown 

<400> 520 
cctnnnnnag g 

<210> 521 
<211> 12 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4)..(9) 

<223> A, T, C, G, other or unknown 

<400> 521 
ccannnnnnt gg 

<210> 522 
<211> 6680 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Vector pCES5 
nucleotide sequence 

<220> 
<221> CDS 

<222> (201)..(1058) 

<220> 
<221> CDS 

<222> (2269)..(2682) 

<220> 
<221> CDS 

<222> (2723)..(2866) 

<220> 
<221> CDS 

<222> (3767)..(3850) 

<220> 
<221> CDS 
<222> (4198)..(5799) 
<400> 522 

gacgaaaggg cctcgtgata cgcctatttt tataggttaa tgtcatgata ataatggttt 60 

cttagacgtc aggtggcact tttcggggaa atgtgcgcgg aacccctatt tgtttatttt 120 

tctaaataca ttcaaatatg tatccgctca tgagacaata accctgataa atgcttcaat 180 

aatattgaaa aaggaagagt atg agt att caa cat ttc cgt gtc gcc ctt att 233 
Met Ser Ile Gln His Phe Arg Val Ala Leu Ile 
1 5 10 

ccc ttt ttt gcg gca ttt tgc ctt cct gtt ttt gct cac cca gaa acg 281 
Pro Phe Phe Ala Ala Phe Cys Leu Pro Val Phe Ala His Pro Glu Thr 
15 20 25 

ctg gtg aaa gta aaa gat gct gaa gat cag ttg ggt gcc cga gtg ggt 329 
Leu Val Lys Val Lys Asp Ala Glu Asp Gln Leu Gly Ala Arg Val Gly 
30 35 40 

tac atc gaa ctg gat ctc aac agc ggt aag atc ctt gag agt ttt cgc 377 
Tyr Ile Glu Leu Asp Leu Asn Ser Gly Lys Ile Leu Glu Ser Phe Arg 
45 50 55 

ccc gaa gaa cgt ttt cca atg atg agc act ttt aaa gtt ctg cta tgt 425 
Pro Glu Glu Arg Phe Pro Met Met Ser Thr Phe Lys Val Leu Leu Cys 
60 65 70 75 

ggc gcg gta tta tcc cgt att gac gcc ggg caa gag caa ctc ggt cgc 473 
Gly Ala Val Leu Ser Arg Ile Asp Ala Gly Gln Glu Gln Leu Gly Arg 
80 85 90 

cgc ata cac tat tct cag aat gac ttg gtt gag tac tca cca gtc aca 521 
Arg Ile His Tyr Ser Gln Asn Asp Leu Val Glu Tyr Ser Pro Val Thr 
95 100 105 

gaa aag cat ctt acg gat ggc atg aca gta aga gaa tta tgc agt gct 569 
Glu Lys His Leu Thr Asp Gly Met Thr Val Arg Glu Leu Cys Ser Ala 
110 115 120 

gcc ata acc atg agt gat aac act gcg gcc aac tta ctt ctg aca acg 617 
Ala Ile Thr Met Ser Asp Asn Thr Ala Ala Asn Leu Leu Leu Thr Thr 
125 130 135 

atc gga gga ccg aag gag cta acc gct ttt ttg cac aac atg ggg gat 665 
Ile Gly Gly Pro Lys Glu Leu Thr Ala Phe Leu His Asn Met Gly Asp 
140 145 150 155 

cat gta act cgc ctt gat cgt tgg gaa ccg gag ctg aat gaa gcc ata 713 
His Val Thr Arg Leu Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala Ile 
160 165 170 

cca aac gac gag cgt gac acc acg atg cct gta gca atg gca aca acg 761 
Pro Asn Asp Glu Arg Asp Thr Thr Met Pro Val Ala Met Ala Thr Thr 
175 180 185 

ttg cgc aaa cta tta act ggc gaa cta ctt act cta gct tcc cgg caa 809 
Leu Arg Lys Leu Leu Thr Gly Glu Leu Leu Thr Leu Ala Ser Arg Gln 
190 195 200 

caa tta ata gac tgg atg gag gcg gat aaa gtt gca gga cca ctt ctg 857 
Gln Leu Ile Asp Trp Met Glu Ala Asp Lys Val Ala Gly Pro Leu Leu 
205 210 215 

cgc tcg gcc ctt ccg gct ggc tgg ttt att gct gat aaa tct gga gcc 905 
Arg Ser Ala Leu Pro Ala Gly Trp Phe Ile Ala Asp Lys Ser Gly Ala 
220 225 230 235 

ggt gag cgt ggg tct cgc ggt atc att gca gca ctg ggg cca gat ggt 953 
Gly Glu Arg Gly Ser Arg Gly Ile Ile Ala Ala Leu Gly Pro Asp Gly 
240 245 250 

aag ccc tcc cgt atc gta gtt atc tac acg acg ggg agt cag gca act 1001 
Lys Pro Ser Arg Ile Val Val Ile Tyr Thr Thr Gly Ser Gln Ala Thr 
255 260 265 

atg gat gaa cga aat aga cag atc gct gag ata ggt gcc tca ctg att 1049 
Met Asp Glu Arg Asn Arg Gln Ile Ala Glu Ile Gly Ala Ser Leu Ile 
270 275 280 

aag cat tgg taactgtcag accaagttta ctcatatata ctttagattg 1098 
Lys His Trp 
285 



atttaaaact tcatttttaa tttaaaagga tctaggtgaa gatccttttt gataatctca 1158 

tgaccaaaat cccttaacgt gagttttcgt tccactgagc gtcagacccc gtagaaaaga 1218 

tcaaaggatc ttcttgagat cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa 1278 

aaccaccgct accagcggtg gtttgtttgc cggatcaaga gctaccaact ctttttccga 1338 

aggtaactgg cttcagcaga gcgcagatac caaatactgt ccttctagtg tagccgtagt 1398 

taggccacca cttcaagaac tctgtagcac cgcctacata cctcgctctg ctaatcctgt 1458 

taccagtggc tgctgccagt ggcgataagt cgtgtcttac cgggttggac tcaagacgat 1518 

agttaccgga taaggcgcag cggtcgggct gaacgggggg ttcgtgcata cagcccagct 1578 

tggagcgaac gacctacacc gaactgagat acctacagcg tgagcattga gaaagcgcca 1638 

cgcttcccga agggagaaag gcggacaggt atccggtaag cggcagggtc ggaacaggag 1698 

agcgcacgag ggagcttcca gggggaaacg cctggtatct ttatagtcct gtcgggtttc 1758 

gccacctctg acttgagcgt cgatttttgt gatgctcgtc aggggggcgg agcctatgga 1818 

aaaacgccag caacgcggcc tttttacggt tcctggcctt ttgctggcct tttgctcaca 1878 

tgttctttcc tgcgttatcc cctgattctg tggataaccg tattaccgcc tttgagtgag 1938 

ctgataccgc tcgccgcagc cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg 1998 

aagagcgccc aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat taatgcagct 2058 
ggcacgacag gtttcccgac tggaaagcgg gcagtgagcg caacgcaatt aatgtgagtt 2118 

agctcactca ttaggcaccc caggctttac actttatgct tccggctcgt atgttgtgtg 2178 

gaattgtgag cggataacaa tttcacacag gaaacagcta tgaccatgat tacgccaagc 2238 

tttggagcct tttttttgga gattttcaac gtg aaa aaa tta tta ttc gca att 2292 
Met Lys Lys Leu Leu Phe Ala Ile 
290 

cct tta gtt gtt cct ttc tat tct cac agt gca cag gtc caa ctg cag 2340 
Pro Leu Val Val Pro Phe Tyr Ser His Ser Ala Gln Val Gln Leu Gln 
295 300 305 310 

gtc gac ctc gag atc aaa cgt gga act gtg gct gca cca tct gtc ttc 2388 
Val Asp Leu Glu Ile Lys Arg Gly Thr Val Ala Ala Pro Ser Val Phe 
315 320 325 

atc ttc ccg cca tct gat gag cag ttg aaa tct gga act gcc tct gtt 2436 
Ile Phe Pro Pro Ser Asp Glu Gln Leu Lys Ser Gly Thr Ala Ser Val 
330 335 340 

gtg tgc ctg ctg aat aac ttc tat ccc aga gag gcc aaa gta cag tgg 2484 
Val Cys Leu Leu Asn Asn Phe Tyr Pro Arg Glu Ala Lys Val Gln Trp 
345 350 355 

aag gtg gat aac gcc ctc caa tcg ggt aac tcc cag gag agt gtc aca 2532 
Lys Val Asp Asn Ala Leu Gln Ser Gly Asn Ser Gln Glu Ser Val Thr 
360 365 370 

gag cag gac agc aag gac agc acc tac agc ctc agc agc acc ctg acg 2580 
Glu Gln Asp Ser Lys Asp Ser Thr Tyr Ser Leu Ser Ser Thr Leu Thr 
375 380 385 390 

ctg agc aaa gca gac tac gag aaa cac aaa gtc tac gcc tgc gaa gtc 2628 
Leu Ser Lys Ala Asp Tyr Glu Lys His Lys Val Tyr Ala Cys Glu Val 
395 400 405 

acc cat cag ggc ctg agt tca ccg gtg aca aag agc ttc aac agg gga 2676 
Thr His Gln Gly Leu Ser Ser Pro Val Thr Lys Ser Phe Asn Arg Gly 
410 415 420 

gag tgt taataaggcg cgccaattct atttcaagga gacagtcata atg aaa tac 2731 
Glu Cys Met Lys Tyr 
425 

cta ttg cct acg gca gcc gct gga ttg tta tta ctc gcg gcc cag ccg 2779 
Leu Leu Pro Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala Ala Gln Pro 
430 435 440 

gcc atg gcc gaa gtt caa ttg tta gag tct ggt ggc ggt ctt gtt cag 2827 
Ala Met Ala Glu Val Gln Leu Leu Glu Ser Gly Gly Gly Leu Val Gln 
445 450 455 

cct ggt ggt tct tta cgt ctt tct tgc gct gct tcc gga gcttcagatc 2876 
Pro Gly Gly Ser Leu Arg Leu Ser Cys Ala Ala Ser Gly 
460 465 470 
tgtttgcctt tttgtggggt ggtgcagatc gcgttacgga gatcgaccga ctgcttgagc 2936 

aaaagccacg cttaactgct gatcaggcat gggatgttat tcgccaaacc agtcgtcagg 2996 

atcttaacct gaggcttttt ttacctactc tgcaagcagc gacatctggt ttgacacaga 3056 

gcgatccgcg tcgtcagttg gtagaaacat taacacgttg ggatggcatc aatttgctta 3116 

atgatgatgg taaaacctgg cagcagccag gctctgccat cctgaacgtt tggctgacca 3176 

gtatgttgaa gcgtaccgta gtggctgccg tacctatgcc atttgataag tggtacagcg 3236 

ccagtggcta cgaaacaacc caggacggcc caactggttc gctgaatata agtgttggag 3296 

caaaaatttt gtatgaggcg gtgcagggag acaaatcacc aatcccacag gcggttgatc 3356 

tgtttgctgg gaaaccacag caggaggttg tgttggctgc gctggaagat acctgggaga 3416 

ctctttccaa acgctatggc aataatgtga gtaactggaa aacacctgca atggccttaa 3476 

cgttccgggc aaataatttc tttggtgtac cgcaggccgc agcggaagaa acgcgtcatc 3536 

aggcggagta tcaaaaccgt ggaacagaaa acgatatgat tgttttctca ccaacgacaa 3596 

gcgatcgtcc tgtgcttgcc tgggatgtgg tcgcacccgg tcagagtggg tttattgctc 3656 

ccgatggaac agttgataag cactatgaag atcagctgaa aatgtacgaa aattttggcc 3716 

gtaagtcgct ctggttaacg aagcaggatg tggaggcgca taaggagtcg tct aga 3772 
Ser Arg 



gac aac tct aag aat act ctc tac ttg cag atg aac agc tta agt ctg 3820 
Asp Asn Ser Lys Asn Thr Leu Tyr Leu Gln Met Asn Ser Leu Ser Leu 
475 480 485 490 

agc att cgg tcc ggg caa cat tct cca aac tgaccagacg acacaaacgg 3870 
Ser Ile Arg Ser Gly Gln His Ser Pro Asn 
495 500 

cttacgctaa atcccgcgca tgggatggta aagaggtggc gtctttgctg gcctggactc 3930 

atcagatgaa ggccaaaaat tggcaggagt ggacacagca ggcagcgaaa caagcactga 3990 

ccatcaactg gtactatgct gatgtaaacg gcaatattgg ttatgttcat actggtgctt 4050 

atccagatcg tcaatcaggc catgatccgc gattacccgt tcctggtacg ggaaaatggg 4110 

actggaaagg gctattgcct tttgaaatga accctaaggt gtataacccc cagaagctag 4170 

cctgcggctt cggtcaccgt ctcaagc gcc tcc acc aag ggc cca tcg gtc ttc 4224 
Ala Ser Thr Lys Gly Pro Ser Val Phe 
505 

ccc ctg gca ccc tcc tcc aag agc acc tct ggg ggc aca gcg gcc ctg 4272 
Pro Leu Ala Pro Ser Ser Lys Ser Thr Ser Gly Gly Thr Ala Ala Leu 
510 515 520 525 




ggc tgc ctg gtc aag gac tac ttc ccc gaa ccg gtg acg gtg tcg tgg 4320 
Gly Cys Leu Val Lys Asp Tyr Phe Pro Glu Pro Val Thr Val Ser Trp 
530 535 540 

aac tca ggc gcc ctg acc agc ggc gtc cac acc ttc ccg gct gtc cta 4368 
Asn Ser Gly Ala Leu Thr Ser Gly Val His Thr Phe Pro Ala Val Leu 
545 550 555 

cag tcc tca gga ctc tac tcc ctc agc agc gta gtg acc gtg ccc tcc 4416 
Gln Ser Ser Gly Leu Tyr Ser Leu Ser Ser Val Val Thr Val Pro Ser 
560 565 570 

agc agc ttg ggc acc cag acc tac atc tgc aac gtg aat cac aag ccc 4464 
Ser Ser Leu Gly Thr Gln Thr Tyr Ile Cys Asn Val Asn His Lys Pro 
575 580 585 

agc aac acc aag gtg gac aag aaa gtt gag ccc aaa tct tgt gcg gcc 4512 
Ser Asn Thr Lys Val Asp Lys Lys Val Glu Pro Lys Ser Cys Ala Ala 
590 595 600 605 

gca cat cat cat cac cat cac ggg gcc gca gaa caa aaa ctc atc tca 4560 
Ala His His His His His His Gly Ala Ala Glu Gln Lys Leu Ile Ser 
610 615 620 

gaa gag gat ctg aat ggg gcc gca tag act gtt gaa agt tgt tta gca 4608 
Glu Glu Asp Leu Asn Gly Ala Ala Thr Val Glu Ser Cys Leu Ala 
625 630 635 

aaa cct cat aca gaa aat tca ttt act aac gtc tgg aaa gac gac aaa 4656 
Lys Pro His Thr Glu Asn Ser Phe Thr Asn Val Trp Lys Asp Asp Lys 
640 645 650 

act tta gat cgt tac gct aac tat gag ggc tgt ctg tgg aat gct aca 4704 
Thr Leu Asp Arg Tyr Ala Asn Tyr Glu Gly Cys Leu Trp Asn Ala Thr 
655 660 665 

ggc gtt gtg gtt tgt act ggt gac gaa act cag tgt tac ggt aca tgg 4752 
Gly Val Val Val Cys Thr Gly Asp Glu Thr Gln Cys Tyr Gly Thr Trp 
670 675 680 

gtt cct att ggg ctt gct atc cct gaa aat gag ggt ggt ggc tct gag 4800 
Val Pro Ile Gly Leu Ala Ile Pro Glu Asn Glu Gly Gly Gly Ser Glu 
685 690 695 700 

ggt ggc ggt tct gag ggt ggc ggt tct gag ggt ggc ggt act aaa cct 4848 
Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly Thr Lys Pro 
705 710 715 

cct gag tac ggt gat aca cct att ccg ggc tat act tat atc aac cct 4896 
Pro Glu Tyr Gly Asp Thr Pro Ile Pro Gly Tyr Thr Tyr Ile Asn Pro 
720 725 730 

ctc gac ggc act tat ccg cct ggt act gag caa aac ccc gct aat cct 4944 
Leu Asp Gly Thr Tyr Pro Pro Gly Thr Glu Gln Asn Pro Ala Asn Pro 
735 740 745 

aat cct tct ctt gag gag tct cag cct ctt aat act ttc atg ttt cag 4992 
Asn Pro Ser Leu Glu Glu Ser Gln Pro Leu Asn Thr Phe Met Phe Gln 
750 755 760 

aat aat agg ttc cga aat agg cag ggt gca tta act gtt tat acg ggc 5040 
Asn Asn Arg Phe Arg Asn Arg Gln Gly Ala Leu Thr Val Tyr Thr Gly 
765 770 775 780 

act gtt act caa ggc act gac ccc gtt aaa act tat tac cag tac act 5088 
Thr Val Thr Gln Gly Thr Asp Pro Val Lys Thr Tyr Tyr Gln Tyr Thr 
785 790 795 

cct gta tca tca aaa gcc atg tat gac gct tac tgg aac ggt aaa ttc 5136 
Pro Val Ser Ser Lys Ala Met Tyr Asp Ala Tyr Trp Asn Gly Lys Phe 
800 805 810 

aga gac tgc gct ttc cat tct ggc ttt aat gag gat cca ttc gtt tgt 5184 
Arg Asp Cys Ala Phe His Ser Gly Phe Asn Glu Asp Pro Phe Val Cys 
815 820 825 

gaa tat caa ggc caa tcg tct gac ctg cct caa cct cct gtc aat gct 5232 
Glu Tyr Gln Gly Gln Ser Ser Asp Leu Pro Gln Pro Pro Val Asn Ala 
830 835 840 

ggc ggc ggc tct ggt ggt ggt tct ggt ggc ggc tct gag ggt ggc ggc 5280 
Gly Gly Gly Ser Gly Gly Gly Ser Gly Gly Gly Ser Glu Gly Gly Gly 
845 850 855 860 

tct gag ggt ggc ggt tct gag ggt ggc ggc tct gag ggt ggc ggt tcc 5328 
Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly Ser 
865 870 875 

ggt ggc ggc tcc ggt tcc ggt gat ttt gat tat gaa aaa atg gca aac 5376 
Gly Gly Gly Ser Gly Ser Gly Asp Phe Asp Tyr Glu Lys Met Ala Asn 
880 885 890 

gct aat aag ggg gct atg acc gaa aat gcc gat gaa aac gcg cta cag 5424 
Ala Asn Lys Gly Ala Met Thr Glu Asn Ala Asp Glu Asn Ala Leu Gln 
895 900 905 

tct gac gct aaa ggc aaa ctt gat tct gtc gct act gat tac ggt gct 5472 
Ser Asp Ala Lys Gly Lys Leu Asp Ser Val Ala Thr Asp Tyr Gly Ala 
910 915 920 

gct atc gat ggt ttc att ggt gac gtt tcc ggc ctt gct aat ggt aat 5520 
Ala Ile Asp Gly Phe Ile Gly Asp Val Ser Gly Leu Ala Asn Gly Asn 
925 930 935 940 

ggt gct act ggt gat ttt gct ggc tct aat tcc caa atg gct caa gtc 5568 
Gly Ala Thr Gly Asp Phe Ala Gly Ser Asn Ser Gln Met Ala Gln Val 
945 950 955 

ggt gac ggt gat aat tca cct tta atg aat aat ttc cgt caa tat tta 5616 
Gly Asp Gly Asp Asn Ser Pro Leu Met Asn Asn Phe Arg Gln Tyr Leu 
960 965 970 

cct tct ttg cct cag tcg gtt gaa tgt cgc cct tat gtc ttt ggc gct 5664 
Pro Ser Leu Pro Gln Ser Val Glu Cys Arg Pro Tyr Val Phe Gly Ala 
975 980 985 

ggt aaa cca tat gaa ttt tct att gat tgt gac aaa ata aac tta ttc 5712 
Gly Lys Pro Tyr Glu Phe Ser Ile Asp Cys Asp Lys Ile Asn Leu Phe 
990 995 1000 

cgt ggt gtc ttt gcg ttt ctt tta tat gtt gcc acc ttt atg tat gta 5760 
Arg Gly Val Phe Ala Phe Leu Leu Tyr Val Ala Thr Phe Met Tyr Val 
1005 1010 1015 1020 

ttt tcg acg ttt gct aac ata ctg cgt aat aag gag tct taataagaat 5809 
Phe Ser Thr Phe Ala Asn Ile Leu Arg Asn Lys Glu Ser 
1025 1030 






tcactggccg tcgttttaca acgtcgtgac tgggaaaacc ctggcgttac ccaacttaat 5869 

cgccttgcag cacatccccc tttcgccagc tggcgtaata gcgaagaggc ccgcaccgat 5929 

cgcccttccc aacagttgcg cagcctgaat ggcgaatggc gcctgatgcg gtattttctc 5989 

cttacgcatc tgtgcggtat ttcacaccgc atataaattg taaacgttaa tattttgtta 6049 

aaattcgcgt taaatttttg ttaaatcagc tcatttttta accaataggc cgaaatcggc 6109 

aaaatccctt ataaatcaaa agaatagccc gagatagggt tgagtgttgt tccagtttgg 6169 

aacaagagtc cactattaaa gaacgtggac tccaacgtca aagggcgaaa aaccgtctat 6229 

cagggcgatg gcccactacg tgaaccatca cccaaatcaa gttttttggg gtcgaggtgc 6289 

cgtaaagcac taaatcggaa ccctaaaggg agcccccgat ttagagcttg acggggaaag 6349 

ccggcgaacg tggcgagaaa ggaagggaag aaagcgaaag gagcgggcgc tagggcgctg 6409 

gcaagtgtag cggtcacgct gcgcgtaacc accacacccg ccgcgcttaa tgcgccgcta 6469 

cagggcgcgt actatggttg ctttgacggg tgcagtctca gtacaatctg ctctgatgcc 6529 

gcatagttaa gccagccccg acacccgcca acacccgctg acgcgccctg acgggcttgt 6589 

ctgctcccgg catccgctta cagacaagct gtgaccgtct ccgggagctg catgtgtcag 6649 

aggttttcac cgtcatcacc gaaacgcgcg a 6680 



<210> 523 
<211> 286 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Vector pCES5 
protein sequence 

<400> 523 

Met Ser Ile Gln His Phe Arg Val Ala Leu Ile Pro Phe Phe Ala Ala 
1 5 10 15 

Phe Cys Leu Pro Val Phe Ala His Pro Glu Thr Leu Val Lys Val Lys 
20 25 30 

Asp Ala Glu Asp Gln Leu Gly Ala Arg Val Gly Tyr Ile Glu Leu Asp 
35 40 45 

Leu Asn Ser Gly Lys Ile Leu Glu Ser Phe Arg Pro Glu Glu Arg Phe 
50 55 60 

Pro Met Met Ser Thr Phe Lys Val Leu Leu Cys Gly Ala Val Leu Ser 
65 70 75 80 

Arg Ile Asp Ala Gly Gln Glu Gln Leu Gly Arg Arg Ile His Tyr Ser 
85 90 95 

Gln Asn Asp Leu Val Glu Tyr Ser Pro Val Thr Glu Lys His Leu Thr 
100 105 110 

Asp Gly Met Thr Val Arg Glu Leu Cys Ser Ala Ala Ile Thr Met Ser 
115 120 125 

Asp Asn Thr Ala Ala Asn Leu Leu Leu Thr Thr Ile Gly Gly Pro Lys 
130 135 140 

Glu Leu Thr Ala Phe Leu His Asn Met Gly Asp His Val Thr Arg Leu 
145 150 155 160 

Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala Ile Pro Asn Asp Glu Arg 
165 170 175 

Asp Thr Thr Met Pro Val Ala Met Ala Thr Thr Leu Arg Lys Leu Leu 
180 185 190 

Thr Gly Glu Leu Leu Thr Leu Ala Ser Arg Gln Gln Leu Ile Asp Trp 
195 200 205 

Met Glu Ala Asp Lys Val Ala Gly Pro Leu Leu Arg Ser Ala Leu Pro 
210 215 220 

Ala Gly Trp Phe Ile Ala Asp Lys Ser Gly Ala Gly Glu Arg Gly Ser 
225 230 235 240 

Arg Gly Ile Ile Ala Ala Leu Gly Pro Asp Gly Lys Pro Ser Arg Ile 
245 250 255 

Val Val Ile Tyr Thr Thr Gly Ser Gln Ala Thr Met Asp Glu Arg Asn 
260 265 270 

Arg Gln Ile Ala Glu Ile Gly Ala Ser Leu Ile Lys His Trp 
275 280 285 



<210> 524 
<211> 138 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Vector pCES5 
protein sequence 
<400> 524 

Met Lys Lys Leu Leu Phe Ala Ile Pro Leu Val Val Pro Phe Tyr Ser 
1 5 10 15 

His Ser Ala Gln Val Gln Leu Gln Val Asp Leu Glu Ile Lys Arg Gly 
20 25 30 

Thr Val Ala Ala Pro Ser Val Phe Ile Phe Pro Pro Ser Asp Glu Gln 
35 40 45 

Leu Lys Ser Gly Thr Ala Ser Val Val Cys Leu Leu Asn Asn Phe Tyr 
50 55 60 

Pro Arg Glu Ala Lys Val Gln Trp Lys Val Asp Asn Ala Leu Gln Ser 
65 70 75 80 

Gly Asn Ser Gln Glu Ser Val Thr Glu Gln Asp Ser Lys Asp Ser Thr 
85 90 95 

Tyr Ser Leu Ser Ser Thr Leu Thr Leu Ser Lys Ala Asp Tyr Glu Lys 
100 105 110 

His Lys Val Tyr Ala Cys Glu Val Thr His Gln Gly Leu Ser Ser Pro 
115 120 125 

Val Thr Lys Ser Phe Asn Arg Gly Glu Cys 
130 135 



<210> 525 
<211> 48 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Vector pCES5 
protein sequence 

<400> 525 

Met Lys Tyr Leu Leu Pro Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala 
1 5 10 15 

Ala Gln Pro Ala Met Ala Glu Val Gln Leu Leu Glu Ser Gly Gly Gly 
20 25 30 

Leu Val Gln Pro Gly Gly Ser Leu Arg Leu Ser Cys Ala Ala Ser Gly 
35 40 45 



<210> 526 
<211> 28 
<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Vector pCES5 
protein sequence 
<400> 526 

Ser Arg Asp Asn Ser Lys Asn Thr Leu Tyr Leu Gln Met Asn Ser Leu 
1 5 10 15 

Ser Leu Ser Ile Arg Ser Gly Gln His Ser Pro Asn 
20 25 



<210> 527 

<211> 533 

<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Vector pCES5 
protein sequence 

<400> 527 

Ala Ser Thr Lys Gly Pro Ser Val Phe Pro Leu Ala Pro Ser Ser Lys 
1 5 10 15 

Ser Thr Ser Gly Gly Thr Ala Ala Leu Gly Cys Leu Val Lys Asp Tyr 
20 25 30 

Phe Pro Glu Pro Val Thr Val Ser Trp Asn Ser Gly Ala Leu Thr Ser 
35 40 45 

Gly Val His Thr Phe Pro Ala Val Leu Gln Ser Ser Gly Leu Tyr Ser 
50 55 60 

Leu Ser Ser Val Val Thr Val Pro Ser Ser Ser Leu Gly Thr Gln Thr 
65 70 75 80 

Tyr Ile Cys Asn Val Asn His Lys Pro Ser Asn Thr Lys Val Asp Lys 
85 90 95 

Lys Val Glu Pro Lys Ser Cys Ala Ala Ala His His His His His His 
100 105 110 

Gly Ala Ala Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Asn Gly Ala 
115 120 125 

Ala Thr Val Glu Ser Cys Leu Ala Lys Pro His Thr Glu Asn Ser Phe 
130 135 140 

Thr Asn Val Trp Lys Asp Asp Lys Thr Leu Asp Arg Tyr Ala Asn Tyr 
145 150 155 160 

Glu Gly Cys Leu Trp Asn Ala Thr Gly Val Val Val Cys Thr Gly Asp 
165 170 175 

Glu Thr Gln Cys Tyr Gly Thr Trp Val Pro Ile Gly Leu Ala Ile Pro 
180 185 190 

Glu Asn Glu Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly 
195 200 205 

Ser Glu Gly Gly Gly Thr Lys Pro Pro Glu Tyr Gly Asp Thr Pro Ile 
210 215 220 

Pro Gly Tyr Thr Tyr Ile Asn Pro Leu Asp Gly Thr Tyr Pro Pro Gly 
225 230 235 240 

Thr Glu Gln Asn Pro Ala Asn Pro Asn Pro Ser Leu Glu Glu Ser Gln 
245 250 255 

Pro Leu Asn Thr Phe Met Phe Gln Asn Asn Arg Phe Arg Asn Arg Gln 
260 265 270 

Gly Ala Leu Thr Val Tyr Thr Gly Thr Val Thr Gln Gly Thr Asp Pro 
275 280 285 

Val Lys Thr Tyr Tyr Gln Tyr Thr Pro Val Ser Ser Lys Ala Met Tyr 
290 295 300 

Asp Ala Tyr Trp Asn Gly Lys Phe Arg Asp Cys Ala Phe His Ser Gly 
305 310 315 320 

Phe Asn Glu Asp Pro Phe Val Cys Glu Tyr Gln Gly Gln Ser Ser Asp 
325 330 335 

Leu Pro Gln Pro Pro Val Asn Ala Gly Gly Gly Ser Gly Gly Gly Ser 
340 345 350 

Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly 
355 360 365 

Gly Gly Ser Glu Gly Gly Gly Ser Gly Gly Gly Ser Gly Ser Gly Asp 
370 375 380 

Phe Asp Tyr Glu Lys Met Ala Asn Ala Asn Lys Gly Ala Met Thr Glu 
385 390 395 400 

Asn Ala Asp Glu Asn Ala Leu Gln Ser Asp Ala Lys Gly Lys Leu Asp 
405 410 415 

Ser Val Ala Thr Asp Tyr Gly Ala Ala Ile Asp Gly Phe Ile Gly Asp 
420 425 430 

Val Ser Gly Leu Ala Asn Gly Asn Gly Ala Thr Gly Asp Phe Ala Gly 
435 440 445 

Ser Asn Ser Gln Met Ala Gln Val Gly Asp Gly Asp Asn Ser Pro Leu 
450 455 460 

Met Asn Asn Phe Arg Gln Tyr Leu Pro Ser Leu Pro Gln Ser Val Glu 
465 470 475 480 

Cys Arg Pro Tyr Val Phe Gly Ala Gly Lys Pro Tyr Glu Phe Ser Ile 
485 490 495 

Asp Cys Asp Lys Ile Asn Leu Phe Arg Gly Val Phe Ala Phe Leu Leu 
500 505 510 

Tyr Val Ala Thr Phe Met Tyr Val Phe Ser Thr Phe Ala Asn Ile Leu 
515 520 525 

Arg Asn Lys Glu Ser 
530 



<210> 528 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 528 

acctcactgg cttccggatt cactttctct 30 



<210> 529 
<211> 42 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 529 

agaaacccac tccaaacctt taccaggagc ttggcgaacc ca 42 

<210> 530 
<211> 51 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 530 

ggaaggcagt gatctagaga tagtgaagcg acctttaacg gagtcagcat a 51 

<210> 531 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 



<400> 531 

ggaaggcagt gatctagaga tag 23 
<210> 532 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 532 

gtgctgactc agccaccctc 



<210> 533 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 533 

gccctgactc agcctgcctc 



<210> 534 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 534 

gagctgactc aggaccctgc 



<210> 535 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 535 

gagctgactc agccaccctc 

<210> 536 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 536 

cctcgacagc gaagtgcaca gagcgtcttg actcagcc 

<210> 537 

<211> 30 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 537 

cctcgacagc gaagtgcaca gagcgtcttg 

<210> 538 

<211> 38 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 538 

cctcgacagc gaagtgcaca gagcgctttg actcagcc 

<210> 539 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 539 

cctcgacagc gaagtgcaca gagcgctttg 

<210> 540 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 540 

cctcgacagc taagtgcaca gagcgctttg actcagcc 
<210> 541 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 541 

cctcgacagc gaagtgcaca gagcgctttg 



<210> 542 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 542 

cctcgacagc gaagtgcaca gagcgaattg actcagcc 



<210> 543 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 543 

cctcgacagc gaagtgcaca gagcgaattg 



<210> 544 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 544 

cctcgacagc gaagtgcaca gtacgaattg actcagcc 



<210> 545 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 545 

cctcgacagc gaagtgcaca gtacgaattg 

<210> 546 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 546 

cctcgacagc gaagtgcaca g 

<210> 547 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 547 

ccgtgtatta ctgtgcgaga g 

<210> 548 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 548 

ctgtgtatta ctgtgcgaga g 

<210> 549 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 



<400> 549 
ccgtatatta ctgtgcgaaa g



<210> 550 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 550 

ctgtgtatta ctgtgcgaaa g 



<210> 551 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 551 

ctgtgtatta ctgtgcgaga c 



<210> 552 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide

<400> 552 

ccatgtatta ctgtgcgaga c 



<210> 553 
<211> 94 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 553 

ggtgtagtga tctagtgaca actctaagaa tactctctac ttgcagatga acagctttag 60 
ggctgaggac actgcagtct actattgtgc gaga 94 



<210> 554 
<211> 94
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 554 

ggtgtagtga tctagtgaca actctaagaa tactctctac ttgcagatga acagctttag 60 
ggctgaggac actgcagtct actattgtgc gaaa 94 



<210> 555 
<211> 85 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 555 

atagtagact gcagtgtcct cagcccttaa gctgttcatc tgcaagtaga gagtattctt 60 
agagttgtct ctagatcact acacc 85 



<210> 556 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 556 

gactgggtgt agtgatctag 20 



<210> 557 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 557 

cttttctttg ttgccgttgg ggtg 24 



<210> 558 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 
<220>

<221> modified_base
<222> (1)..(9)

<223> A, T, C, G, other or unknown 

<400> 558 
nnnnnnnnng caggt 

<210> 559 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (7)..(11)

<223> A, T, C, G, other or unknown 

<400> 559 
acctgcnnnn n 



<210> 560 
<211> 10 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4)..(7)

<223> A, T, C, G, other or unknown 

<400> 560 
gatnnnnatc 

<210> 561 
<211> 16 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (7)..(16)

<223> A, T, C, G, other or unknown

<400> 561 
gaggagnnnn nnnnnn 

<210> 562 
<211> 16 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (1)..(10)

<223> A, T, C, G, other or unknown 

<400> 562 
nnnnnnnnnn ctcctc 

<210> 563 
<211> 10 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (7)..(10)

<223> A, T, C, G, other or unknown 

<400> 563 
ctcttcnnnn 



<210> 564 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (1)..(5)

<223> A, T, C, G, other or unknown 

<400> 564 
nnnnngaaga g 
<210> 565
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (1)..(15)

<223> A, T, C, G, other or unknown 
<400> 565 

nnnnnnnnnn nnnnngtccc 



<210> 566 
<211> 12 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (4)..(9)

<223> A, T, C, G, other or unknown

<400> 566 
gacnnnnnng tc 

<210> 567 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (7)..(11)

<223> A, T, C, G, other or unknown 

<400> 567 
cgtctcnnnn n 



<210> 568 
<211> 12 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (7)..(12)

<223> A, T, C, G, other or unknown

<400> 568 
gtatccnnnn nn 



<210> 569 
<211> 12 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (4)..(9)

<223> A, T, C, G, other or unknown 

<400> 569 
gcannnnnnt cg

<210> 570 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4)..(8)

<223> A, T, C, G, other or unknown 

<400> 570 
gccnnnnngg c



<210> 571 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 
<223> Description of Artificial Sequence: Synthetic
oligonucleotide 

<220> 

<221> modified_base 
<222> (7)..(11)

<223> A, T, C, G, other or unknown 

<400> 571 
ggtctcnnnn n 



<210> 572 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4)..(8)

<223> A, T, C, G, other or unknown 

<400> 572 
gacnnnnngt c 

<210> 573 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4)..(8)

<223> A, T, C, G, other or unknown 

<400> 573 
gacnnnnngt c 

<210> 574 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 



<220> 
<221> modified_base
<222> (4)..(8)

<223> A, T, C, G, other or unknown 

<400> 574 
ccannnnntg g 

<210> 575 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (4)..(12)

<223> A, T, C, G, other or unknown 

<400> 575 
ccannnnnnn nntgg 

<210> 576 
<211> 13 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (5)..(9)

<223> A, T, C, G, other or unknown 

<400> 576 
ggccnnnnng gcc 

<210> 577 
<211> 12 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 
<222> (4)..(9)

<223> A, T, C, G, other or unknown

<400> 577
ccannnnnnt gg 

<210> 578 
<211> 11 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (4)..(8)

<223> A, T, C, G, other or unknown 

<400> 578 
cctnnnnnag g 

<210> 579 

<211> 10 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base 

<222> (4)..(7)

<223> A, T, C, G, other or unknown 

<400> 579 
gacnnnngtc 



<210> 580 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 



<220>

<221> modified_base
<222> (4)..(12)

<223> A, T, C, G, other or unknown



<400> 580 
ccannnnnnn nntgg 
<210> 581
<211> 11 
<212> DNA 

<213> Artificial Sequence 

<220>
<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (4)..(8)

<223> A, T, C, G, other or unknown 
<400> 581 

gcannnnntg c 11 



<210> 582 

<211> 10251 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: CJRA05 
nucleotide sequence 

<220> 

<221> CDS
<222> (1578)..(1916)

<220>
<221> CDS
<222> (2388)..(2843)

<220>
<221> CDS
<222> (2849)..(2893)

<220>
<221> CDS
<222> (3189)..(4232)

<220>
<221> CDS
<222> (7418)..(8119)

<220>
<221> CDS
<222> (8160)..(9452)
<400> 582 

aatgctacta ctattagtag aattgatgcc accttttcag ctcgcgcccc aaatgaaaat 60 
atagctaaac aggttattga ccatttgcga aatgtatcta atggtcaaac taaatctact 120 
cgttcgcaga attgggaatc aactgttata tggaatgaaa cttccagaca ccgtacttta 180 
gttgcatatt taaaacatgt tgagctacag cattatattc agcaattaag ctctaagcca 240
tccgcaaaaa tgacctctta tcaaaaggag caattaaagg tactctctaa tcctgacctg 300
ttggagtttg cttccggtct ggttcgcttt gaagctcgaa ttaaaacgcg atatttgaag 360
tctttcgggc ttcctcttaa tctttttgat gcaatccgct ttgcttctga ctataatagt 420
cagggtaaag acctgatttt tgatttatgg tcattgtcgt tttctgaact gtttaaagca 480
tttgaggggg attcaatgaa tatttatgac gattccgcag tattggacgc tatccagtct 540
aaacatttta ctattacccc ctctggcaaa acttcttttg caaaagcctc tcgctatttt 600
ggtttttatc gtcgtctggt aaacgagggt tatgatagtg ttgctcttac tatgcctcgt 660
aattcctttt ggcgttatgt atctgcatta gttgaatgtg gtattcctaa atctcaactg 720
atgaatcttt ctacctgtaa taatgttgtt ccgttagttc gttttattaa cgtagatttt 780
tcttcccaac gtcctgactg gtataatgag ccagttctta aaatcgcata aggtaattca 840
caatgattaa agttgaaatt aaaccatctc aagcccaatt tactactcgt tctggtgttt 900
ctcgtcaggg caagccttat tcactgaatg agcagctttg ttacgttgat ttgggtaatg 960
aatatccggt tcttgtcaag attactcttg atgaaggtca gccagcctat gcgcctggtc 1020
tgtacaccgt tcatctgtcc tctttcaaag ttggtcagtt cggttccctt atgattgacc 1080
gtctgcgcct cgttccggct aagtaacatg gagcaggtcg cggatttcga cacaatttat 1140
caggcgatga tacaaatctc cgttgtactt tgtttcgcgc ttggtataat cgctgggggt 1200
caaagatgag tgttttagtg tattcttttg cctctttcgt tttaggttgg tgccttcgta 1260
arte* a t" t~ c* gtattttacc cgtttaatgg aaacttcctc atgaaaaagt ctttagtcct 1320
caaagcctct gtagccgttg ctaccctcgt tccgatgctg tctttcgctg ctgagggtga 1380
cgatcccgca aaagcggcct ttaactccct gcaagcctca gcgaccgaat atatcggtta 1440
tgcgtgggcg atggttgttg tcattgtcgg cgcaactatc ggtatcaagc tgtttaagaa 1500
attcacctcg aaagcaagct gataaaccga tacaattaaa ggctcctttt ggagcctttt 1560
ttttggagat tttcaac gtg aaa aaa tta tta ttc gca att cct tta gtt 1610
Met Lys Lys Leu Leu Phe Ala Ile Pro Leu Val
1 5 10

gtt cct ttc tat tct ggc gcg gcc gaa tca cat cta gac ggc gcc gct 1658
Val Pro Phe Tyr Ser Gly Ala Ala Glu Ser His Leu Asp Gly Ala Ala
15 20 25

gaa act gtt gaa agt tgt tta gca aaa tcc cat aca gaa aat tca ttt 1706
Glu Thr Val Glu Ser Cys Leu Ala Lys Ser His Thr Glu Asn Ser Phe
30 35 40

act aac gtc tgg aaa gac gac aaa act tta gat cgt tac gct aac tat 1754
Thr Asn Val Trp Lys Asp Asp Lys Thr Leu Asp Arg Tyr Ala Asn Tyr
45 50 55

gag ggc tgt ctg tgg aat gct aca ggc gtt gta gtt tgt act ggt gac 1802
Glu Gly Cys Leu Trp Asn Ala Thr Gly Val Val Val Cys Thr Gly Asp
60 65 70 75

gaa act cag tgt tac ggt aca tgg gtt cct att ggg ctt gct atc cct 1850
Glu Thr Gln Cys Tyr Gly Thr Trp Val Pro Ile Gly Leu Ala Ile Pro
80 85 90

gaa aat gag ggt ggt ggc tct gag ggt ggc ggt tct gag ggt ggc ggt 1898 
Glu Asn Glu Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly 
95 100 105 

tct gag ggt ggc ggt act aaacctcctg agtacggtga tacacctatt 1946 
Ser Glu Gly Gly Gly Thr 
110 



ccgggctata cttatatcaa ccctctcgac ggcacttatc cgcctggtac tgagcaaaac 2006
cccgctaatc ctaatccttc tcttgaggag tctcagcctc ttaatacttt catgtttcag 2066
aataataggt tccgaaatag gcagggggca ttaactgttt atacgggcac tgttactcaa 2126
ggcactgacc ccgttaaaac ttattaccag tacactcctg tatcatcaaa agccatgtat 2186
gacgcttact ggaacggtaa attcagagac tgcgctttcc attctggctt taatgaggat 2246
ttatttgttt gtgaatatca aggccaatcg tctgacctgc ctcaacctcc tgtcaatgct 2306
ggcggcggct ctggtggtgg ttctggtggc ggctctgagg gtggtggctc tgagggaggc 2366
ggttccggtg gtggctctgg t tcc ggt gat ttt gat tat gaa aag atg gca 2417
Ser Gly Asp Phe Asp Tyr Glu Lys Met Ala
115 120



aac gct aat aag ggg gct atg acc gaa aat gcc gat gaa aac gcg cta 2465
Asn Ala Asn Lys Gly Ala Met Thr Glu Asn Ala Asp Glu Asn Ala Leu
125 130 135

cag tct gac gct aaa ggc aaa ctt gat tct gtc gct act gat tac ggt 2513
Gln Ser Asp Ala Lys Gly Lys Leu Asp Ser Val Ala Thr Asp Tyr Gly
140 145 150 155

gct gct atc gat ggt ttc att ggt gac gtt tcc ggc ctt gct aat ggt 2561
Ala Ala Ile Asp Gly Phe Ile Gly Asp Val Ser Gly Leu Ala Asn Gly
160 165 170

aat ggt gct act ggt gat ttt gct ggc tct aat tcc caa atg gct caa 2609
Asn Gly Ala Thr Gly Asp Phe Ala Gly Ser Asn Ser Gln Met Ala Gln
175 180 185

gtc ggt gac ggt gat aat tca cct tta atg aat aat ttc cgt caa tat 2657
Val Gly Asp Gly Asp Asn Ser Pro Leu Met Asn Asn Phe Arg Gln Tyr
190 195 200

tta cct tcc ctc cct caa tcg gtt gaa tgt cgc cct ttt gtc ttt ggc 2705
Leu Pro Ser Leu Pro Gln Ser Val Glu Cys Arg Pro Phe Val Phe Gly
205 210 215

gct ggt aaa cca tat gaa ttt tct att gat tgt gac aaa ata aac tta 2753
Ala Gly Lys Pro Tyr Glu Phe Ser Ile Asp Cys Asp Lys Ile Asn Leu
220 225 230 235

ttc cgt ggt gtc ttt gcg ttt ctt tta tat gtt gcc acc ttt atg tat 2801
Phe Arg Gly Val Phe Ala Phe Leu Leu Tyr Val Ala Thr Phe Met Tyr
240 245 250

gta ttt tct acg ttt gct aac ata ctg cgt aat aag gag tct taatc atg 2851
Val Phe Ser Thr Phe Ala Asn Ile Leu Arg Asn Lys Glu Ser Met
255 260 265

cca gtt ctt ttg ggt att ccg tta tta ttg cgt ttc ctc ggt 2893
Pro Val Leu Leu Gly Ile Pro Leu Leu Leu Arg Phe Leu Gly
270 275 280

ttccttctgg taactttgtt cggctatctg cttacttttc ttaaaaaggg cttcggtaag 2953

atagctattg ctatttcatt gtttcttgct cttattattg ggcttaactc aattcttgtg 3013

ggttatctct ctgatattag cgctcaatta ccctctgact ttgttcaggg tgttcagtta 3073

attctcccgt ctaatgcgct tccctgtttt tatgttattc tctctgtaaa ggctgctatt 3133

ttcatttttg acgttaaaca aaaaatcgtt tcttatttgg attgggataa ataat atg 3191
Met

gct gtt tat ttt gta act ggc aaa tta ggc tct gga aag acg ctc gtt 3239
Ala Val Tyr Phe Val Thr Gly Lys Leu Gly Ser Gly Lys Thr Leu Val
285 290 295

agc gtt ggt aag att cag gat aaa att gta gct ggg tgc aaa ata gca 3287
Ser Val Gly Lys Ile Gln Asp Lys Ile Val Ala Gly Cys Lys Ile Ala
300 305 310

act aat ctt gat tta agg ctt caa aac ctc ccg caa gtc ggg agg ttc 3335
Thr Asn Leu Asp Leu Arg Leu Gln Asn Leu Pro Gln Val Gly Arg Phe
315 320 325

gct aaa acg cct cgc gtt ctt aga ata ccg gat aag cct tct ata tct 3383
Ala Lys Thr Pro Arg Val Leu Arg Ile Pro Asp Lys Pro Ser Ile Ser
330 335 340 345

gat ttg ctt gct att ggg cgc ggt aat gat tcc tac gat gaa aat aaa 3431
Asp Leu Leu Ala Ile Gly Arg Gly Asn Asp Ser Tyr Asp Glu Asn Lys
350 355 360

aac ggc ttg ctt gtt ctc gat gag tgc ggt act tgg ttt aat acc cgt 3479
Asn Gly Leu Leu Val Leu Asp Glu Cys Gly Thr Trp Phe Asn Thr Arg
365 370 375

tct tgg aat gat aag gaa aga cag ccg att att gat tgg ttt cta cat 3527
Ser Trp Asn Asp Lys Glu Arg Gln Pro Ile Ile Asp Trp Phe Leu His
380 385 390

gct cgt aaa tta gga tgg gat att att ttt ctt gtt cag gac tta tct 3575
Ala Arg Lys Leu Gly Trp Asp Ile Ile Phe Leu Val Gln Asp Leu Ser
395 400 405

att gtt gat aaa cag gcg cgt tct gca tta gct gaa cat gtt gtt tat 3623
Ile Val Asp Lys Gln Ala Arg Ser Ala Leu Ala Glu His Val Val Tyr
410 415 420 425

tgt cgt cgt ctg gac aga att act tta cct ttt gtc ggt act tta tat 3671
Cys Arg Arg Leu Asp Arg Ile Thr Leu Pro Phe Val Gly Thr Leu Tyr
430 435 440

tct ctt att act ggc tcg aaa atg cct ctg cct aaa tta cat gtt ggc 3719
Ser Leu Ile Thr Gly Ser Lys Met Pro Leu Pro Lys Leu His Val Gly
445 450 455

gtt gtt aaa tat ggc gat tct caa tta agc cct act gtt gag cgt tgg 3767
Val Val Lys Tyr Gly Asp Ser Gln Leu Ser Pro Thr Val Glu Arg Trp
460 465 470

ctt tat act ggt aag aat ttg tat aac gca tat gat act aaa cag gct 3815
Leu Tyr Thr Gly Lys Asn Leu Tyr Asn Ala Tyr Asp Thr Lys Gln Ala
475 480 485

ttt tct agt aat tat gat tcc ggt gtt tat tct tat tta acg cct tat 3863
Phe Ser Ser Asn Tyr Asp Ser Gly Val Tyr Ser Tyr Leu Thr Pro Tyr
490 495 500 505

tta tca cac ggt cgg tat ttc aaa cca tta aat tta ggt cag aag atg 3911
Leu Ser His Gly Arg Tyr Phe Lys Pro Leu Asn Leu Gly Gln Lys Met
510 515 520

aaa tta act aaa ata tat ttg aaa aag ttt tct cgc gtt ctt tgt ctt 3959
Lys Leu Thr Lys Ile Tyr Leu Lys Lys Phe Ser Arg Val Leu Cys Leu
525 530 535

gcg att gga ttt gca tca gca ttt aca tat agt tat ata acc caa cct 4007
Ala Ile Gly Phe Ala Ser Ala Phe Thr Tyr Ser Tyr Ile Thr Gln Pro
540 545 550

aag ccg gag gtt aaa aag gta gtc tct cag acc tat gat ttt gat aaa 4055
Lys Pro Glu Val Lys Lys Val Val Ser Gln Thr Tyr Asp Phe Asp Lys
555 560 565

ttc act att gac tct tct cag cgt ctt aat cta agc tat cgc tat gtt 4103
Phe Thr Ile Asp Ser Ser Gln Arg Leu Asn Leu Ser Tyr Arg Tyr Val
570 575 580 585

ttc aag gat tct aag gga aaa tta att aat agc gac gat tta cag aag 4151
Phe Lys Asp Ser Lys Gly Lys Leu Ile Asn Ser Asp Asp Leu Gln Lys
590 595 600

caa ggt tat tca ctc aca tat att gat tta tgt act gtt tcc att aaa 4199
Gln Gly Tyr Ser Leu Thr Tyr Ile Asp Leu Cys Thr Val Ser Ile Lys
605 610 615

aaa ggt aat tca aat gaa att gtt aaa tgt aat taattttgtt ttcttgatgt 4252
Lys Gly Asn Ser Asn Glu Ile Val Lys Cys Asn
620 625



ttgtttcatc atcttctttt gctcaggtaa ttgaaatgaa taattcgcct ctgcgcgatt 4312
ttgtaacttg gtattcaaag caatcaggcg aatccgttat tgtttctccc gatgtaaaag 4372
gtactgttac tgtatattca tctgacgtta aacctgaaaa tctacgcaat ttctttattt 4432
ctgttttacg tgcaaataat tttgatatgg taggttctaa cccttccatt attcagaagt 4492
ataatccaaa caatcaggat tatattgatg aattgccatc atctgataat caggaatatg 4552
atgataattc cgctccttct ggtggtttct ttgttccgca aaatgataat gttactcaaa 4612
cttttaaaat taataacgtt cgggcaaagg atttaatacg agttgtcgaa ttgtttgtaa 4672
agtctaatac ttctaaatcc tcaaatgtat tatctattga cggctctaat ctattagttg 4732
ttagtgctcc taaagatatt ttagataacc ttcctcaatt cctttcaact gttgatttgc 4792
caactgacca gatattgatt gagggtttga tatttgaggt tcagcaaggt gatgctttag 4852
atttttcatt tgctgctggc tctcagcgtg gcactgttgc aggcggtgtt aatactgacc 4912
gcctcacctc tgttttatct tctgctggtg gttcgttcgg tatttttaat ggcgatgttt 4972
tagggctatc agttcgcgca ttaaagacta atagccattc aaaaatattg tctgtgccac 5032
gtattcttac gctttcaggt cagaagggtt ctatctctgt tggccagaat gtccctttta 5092
ttactggtcg tgtgactggt gaatctgcca atgtaaataa tccatttcag acgattgagc 5152
gtcaaaatgt aggtatttcc atgagcgttt ttcctgttgc aatggctggc ggtaatattg 5212
ttctggatat taccagcaag gccgatagtt tgagttcttc tactcaggca agtgatgtta 5272
ttactaatca aagaagtatt gctacaacgg ttaatttgcg tgatggacag actcttttac 5332
tcggtggcct cactgattat aaaaacactt ctcaggattc tggcgtaccg ttcctgtcta 5392
aaatcccttt aatcggcctc ctgtttagct cccgctctga ttctaacgag gaaagcacgt 5452
ctcnccc* tata gcggcgcatt aaaccrcoGca 5512
ggtgtggtgg ttacgcgcag cgtgaccgct acacttgcca gcgccctagc gcccgctcct 5572
ttcgctttct tcccttcctt tctcgccacg ttcgccggct ttccccgtca agctctaaat 5632
cgggggctcc ctttagggtt ccgatttagt gctttacggc acctcgaccc caaaaaactt 5692
gatttgggtg atggttcacg tagtgggcca tcgccctgat agacggtttt tcgccctttg 5752
acgttggagt ccacgttctt taatagtgga ctcttgttcc aaactggaac aacactcaac 5812
cctatctcgg gctattcttt tgatttataa gggattttgc cgatttcgga accaccatca 5872
aacaggattt tcgcctgctg gggcaaacca gcgtggaccg cttgctgcaa ctctctcagg 5932
gccaggcggt gaagggcaat cagctgttgc ccgtctcact ggtgaaaaga aaaaccaccc 5992
tggatccaag cttgcaggtg gcacttttcg gggaaatgtg cgcggaaccc ctatttgttt 6052
atttttctaa atacattcaa atatgtatcc gctcatgaga caataaccct gataaatgct 6112
tcaataatat tgaaaaagga agagtatgag tattcaacat ttccgtgtcg cccttattcc 6172
cttttttgcg gcattttgcc ttcctgtttt tgctcaccca gaaacgctgg tgaaagtaaa 6232
agatgctgaa gatcagttgg gcgcactagt gggttacatc gaactggatc tcaacagcgg 6292
taagatcctt gagagttttc gccccgaaga acgttttcca atgatgagca cttttaaagt 6352
tctgctatgt ggcgcggtat tatcccgtat tgacgccggg caagagcaac tcggtcgccg 6412
catacactat tctcagaatg acttggttga gtactcacca gtcacagaaa agcatcttac 6472
ggatggcatg acagtaagag aattatgcag tgctgccata accatgagtg ataacactgc 6532
ggccaactta cttctgacaa cgatcggagg accgaaggag ctaaccgctt ttttgcacaa 6592
catgggggat catgtaactc gccttgatcg ttgggaaccg gagctgaatg aagccatacc 6652
aaacgacgag cgtgacacca cgatgcctgt agcaatggca acaacgttgc gcaaactatt 6712
aactggcgaa ctacttactc tagcttcccg gcaacaatta atagactgga tggaggcgga 6772
taaagttgca ggaccacttc tgcgctcggc ccttccggct ggctggttta ttgctgataa 6832
atctggagcc ggtgagcgtg ggtctcgcgg tatcattgca gcactggggc cagatggtaa 6892
gccctcccgt atcgtagtta tctacacgac ggggagtcag gcaactatgg atgaacgaaa 6952
tagacagatc gctgagatag gtgcctcact gattaagcat tggtaactgt cagaccaagt 7012
ttactcatat atactttaga t tydLCuoad aCLccaLttt taatttaaaa ggatctaggt 7072
gaagatcctt tttgataatc tcatgaccaa aatcccttaa cgtgagtttt cgttccactg 7132
tacgtaagac ccccaagctt gtcgactgaa tggcgaatgg cgctttgcct ggtttccggc 7192
accagaagcg gtgccggaaa gctggctgga gtgcgatctt cctgacgctc gagcgcaacg 7252
caattaatgt gagttagctc actcattagg caccccaggc tttacacttt atgcttccgg 7312
ctcgtatgtt gtgtggaatt gtgagcggat aacaatttca cacaggaaac agctatgacc 7372
atgattacgc caagctttgg agcctttttt ttggagattt tcaac gtg aaa aaa tta 7429
Met Lys Lys Leu
630



tta ttc gca att cct tta gtt gtt cct ttc tat tct cac agt gca caa 7477
Leu Phe Ala Ile Pro Leu Val Val Pro Phe Tyr Ser His Ser Ala Gln
635 640 645

gac atc cag atg acc cag tct cca gcc acc ctg tct ttg tct cca ggg 7525
Asp Ile Gln Met Thr Gln Ser Pro Ala Thr Leu Ser Leu Ser Pro Gly
650 655 660

gaa aga gcc acc ctc tcc tgc agg gcc agt cag ggt gtt agc agc tac 7573
Glu Arg Ala Thr Leu Ser Cys Arg Ala Ser Gln Gly Val Ser Ser Tyr
665 670 675 680

tta gcc tgg tac cag cag aaa cct ggc cag gct ccc agg ctc ctc atc 7621
Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln Ala Pro Arg Leu Leu Ile
685 690 695

tat gat gca tcc aac agg gcc act ggc atc cca gcc agg ttc agt ggc 7669
Tyr Asp Ala Ser Asn Arg Ala Thr Gly Ile Pro Ala Arg Phe Ser Gly
700 705 710

agt ggg cct ggg aca gac ttc act ctc acc atc agc agc cta gag cct 7717
Ser Gly Pro Gly Thr Asp Phe Thr Leu Thr Ile Ser Ser Leu Glu Pro
715 720 725

gaa gat ttt gca gtt tat tac tgt cag cag cgt aac tgg cat ccg tgg 7765
Glu Asp Phe Ala Val Tyr Tyr Cys Gln Gln Arg Asn Trp His Pro Trp
730 735 740

acg ttc ggc caa ggg acc aag gtg gaa atc aaa cga act gtg gct gca 7813
Thr Phe Gly Gln Gly Thr Lys Val Glu Ile Lys Arg Thr Val Ala Ala
745 750 755 760

cca tct gtc ttc atc ttc ccg cca tct gat gag cag ttg aaa tct gga 7861
Pro Ser Val Phe Ile Phe Pro Pro Ser Asp Glu Gln Leu Lys Ser Gly
765 770 775

act gcc tct gtt gtg tgc ctg ctg aat aac ttc tat ccc aga gag gcc 7909
Thr Ala Ser Val Val Cys Leu Leu Asn Asn Phe Tyr Pro Arg Glu Ala
780 785 790

aaa gta cag tgg aag gtg gat aac gcc ctc caa tcg ggt aac tcc cag 7957
Lys Val Gln Trp Lys Val Asp Asn Ala Leu Gln Ser Gly Asn Ser Gln
795 800 805

gag agt gtc aca gag cgg gac agc aag gac agc acc tac agc ctc agc 8005
Glu Ser Val Thr Glu Arg Asp Ser Lys Asp Ser Thr Tyr Ser Leu Ser
810 815 820

agc acc ctg acg ctg agc aaa gca gac tac gag aaa cac aaa gtc tac 8053
Ser Thr Leu Thr Leu Ser Lys Ala Asp Tyr Glu Lys His Lys Val Tyr
825 830 835 840

gcc tgc gaa gtc acc cat cag ggc ctg agc tcg ccc gtc aca aag agc 8101
Ala Cys Glu Val Thr His Gln Gly Leu Ser Ser Pro Val Thr Lys Ser
845 850 855

ttc aac agg gga gag tgt taataaggcg cgccaattct atttcaagga 8149
Phe Asn Arg Gly Glu Cys
860

gacagtcata atg aaa tac cta ttg cct acg gca gcc gct gga ttg tta 8198
Met Lys Tyr Leu Leu Pro Thr Ala Ala Ala Gly Leu Leu
865 870 875
tta ctc gcg gcc cag ccg gcc atg gcc gaa gtt caa ttg tta gag tct 8246
Leu Leu Ala Ala Gln Pro Ala Met Ala Glu Val Gln Leu Leu Glu Ser
880 885 890

ggt ggc ggt ctt gtt cag cct ggt ggt tct tta cgt ctt tct tgc gct 8294
Gly Gly Gly Leu Val Gln Pro Gly Gly Ser Leu Arg Leu Ser Cys Ala
895 900 905

gct tcc gga ttc act ttc tct act tac gag atg cgt tgg gtt cgc caa 8342
Ala Ser Gly Phe Thr Phe Ser Thr Tyr Glu Met Arg Trp Val Arg Gln
910 915 920

gct cct ggt aaa ggt ttg gag tgg gtt tct tat atc gct cct tct ggt 8390
Ala Pro Gly Lys Gly Leu Glu Trp Val Ser Tyr Ile Ala Pro Ser Gly
925 930 935

ggc gat act gct tat gct gac tcc gtt aaa ggt cgc ttc act atc tct 8438
Gly Asp Thr Ala Tyr Ala Asp Ser Val Lys Gly Arg Phe Thr Ile Ser
940 945 950 955

aga gac aac tct aag aat act ctc tac ttg cag atg aac agc tta agg 8486
Arg Asp Asn Ser Lys Asn Thr Leu Tyr Leu Gln Met Asn Ser Leu Arg
960 965 970

gct gag gac act gca gtc tac tat tgt gcg agg agg ctc gat ggc tat 8534
Ala Glu Asp Thr Ala Val Tyr Tyr Cys Ala Arg Arg Leu Asp Gly Tyr
975 980 985

att tcc tac tac tac ggt atg gac gtc tgg ggc caa ggg acc acg gtc 8582
Ile Ser Tyr Tyr Tyr Gly Met Asp Val Trp Gly Gln Gly Thr Thr Val
990 995 1000

acc gtc tca agc gcc tcc acc aag ggc cca tcg gtc ttc ccc ctg gca 8630
Thr Val Ser Ser Ala Ser Thr Lys Gly Pro Ser Val Phe Pro Leu Ala
1005 1010 1015

ccc tcc tcc aag agc acc tct ggg ggc aca gcg gcc ctg ggc tgc ctg 8678
Pro Ser Ser Lys Ser Thr Ser Gly Gly Thr Ala Ala Leu Gly Cys Leu
1020 1025 1030 1035

gtc aag gac tac ttc ccc gaa ccg gtg acg gtg tcg tgg aac tca ggc 8726
Val Lys Asp Tyr Phe Pro Glu Pro Val Thr Val Ser Trp Asn Ser Gly
1040 1045 1050

gcc ctg acc agc ggc gtc cac acc ttc ccg gct gtc cta cag tcc tca 8774
Ala Leu Thr Ser Gly Val His Thr Phe Pro Ala Val Leu Gln Ser Ser
1055 1060 1065

gga ctc tac tcc ctc agc agc gta gtg acc gtg ccc tcc agc agc ttg 8822
Gly Leu Tyr Ser Leu Ser Ser Val Val Thr Val Pro Ser Ser Ser Leu
1070 1075 1080

ggc acc cag acc tac atc tgc aac gtg aat cac aag ccc agc aac acc 8870
Gly Thr Gln Thr Tyr Ile Cys Asn Val Asn His Lys Pro Ser Asn Thr
1085 1090 1095

aag gtg gac aag aaa gtt gag ccc aaa tct tgt gcg gcc gca cat cat 8918
Lys Val Asp Lys Lys Val Glu Pro Lys Ser Cys Ala Ala Ala His His
1100 1105 1110 1115

cat cac cat cac ggg gcc gca gaa caa aaa ctc atc tca gaa gag gat 8966
His His His His Gly Ala Ala Glu Gln Lys Leu Ile Ser Glu Glu Asp
1120 1125 1130



ctg aat ggg gcc gca tag gct agc tct gct wsy ggy gay tty gay tay 9014
Leu Asn Gly Ala Ala Gln Ala Ser Ser Ala Ser Gly Asp Phe Asp Tyr
1135 1140 1145

gar aar atg gct aaw gcy aay aar ggs gcy atg acy gar aay gcy gay 9062
Glu Lys Met Ala Asn Ala Asn Lys Gly Ala Met Thr Glu Asn Ala Asp
1150 1155 1160

gar aay gck ytr car wsy gay gcy aar ggy aar ytw gay wsy gtc gck 9110
Glu Asn Ala Leu Gln Ser Asp Ala Lys Gly Lys Leu Asp Ser Val Ala
1165 1170 1175

acy gay tay ggy gcy gcc atc gay ggy tty aty ggy gay gtc wsy ggy 9158
Thr Asp Tyr Gly Ala Ala Ile Asp Gly Phe Ile Gly Asp Val Ser Gly
1180 1185 1190 1195

ytk gcy aay ggy aay ggy gcy acy ggw gay tty gcw ggy tck aat tcy 9206
Leu Ala Asn Gly Asn Gly Ala Thr Gly Asp Phe Ala Gly Ser Asn Ser
1200 1205 1210

car atg gcy car gty ggw gay ggk gay aay wsw cck ytw atg aay aay 9254
Gln Met Ala Gln Val Gly Asp Gly Asp Asn Ser Pro Leu Met Asn Asn
1215 1220 1225

tty mgw car tay ytw cck tcy cty cck car wsk gty gar tgy cgy ccw 9302
Phe Arg Gln Tyr Leu Pro Ser Leu Pro Gln Ser Val Glu Cys Arg Pro
1230 1235 1240

tty gty tty wsy gcy ggy aar ccw tay gar tty wsy aty gay tgy gay 9350
Phe Val Phe Ser Ala Gly Lys Pro Tyr Glu Phe Ser Ile Asp Cys Asp
1245 1250 1255

aar atm aay ytw tty cgy ggy gty tty gck tty ytk yta tay gty gcy 9398
Lys Ile Asn Leu Phe Arg Gly Val Phe Ala Phe Leu Leu Tyr Val Ala
1260 1265 1270 1275

acy tty atg tay gtw tty wsy ack tty gcy aay atw ytr cgy aay aar 9446
Thr Phe Met Tyr Val Phe Ser Thr Phe Ala Asn Ile Leu Arg Asn Lys
1280 1285 1290

gar wsy tagtgatctc ctaggaagcc cgcctaatga gcgggctttt tttttctggt 9502
Glu Ser



atgcatcctg aggccgatac tgtcgtcgtc ccctcaaact ggcagatgca cggttacgat 9562

gcgcccatct acaccaacgt gacctatccc attacggtca atccgccgtt tgttcccacg 9622

gagaatccga cgggttgtta ctcgctcaca tttaatgttg atgaaagctg gctacaggaa 9682 

ggccagacgc gaattatttt tgatggcgtt cctattggtt aaaaaatgag ctgatttaac 9742 
aaaaatttaa tgcgaatttt aacaaaatat taacgtttac aatttaaata tttgcttata 9802
caatcttcct gtttttgggg cttttctgat tatcaaccgg ggtacatatg attgacatgc 9862
tagttttacg attaccgttc atcgattctc ttgtttgctc cagactctca ggcaatgacc 9922
tgatagcctt tgtagatctc tcaaaaatag ctaccctctc cggcattaat ttatcagcta 9982 
gaacggttga atatcatatt gatggtgatt tgactgtctc cggcctttct cacccttttg 10042 
aatctttacc tacacattac tcaggcattg catttaaaat atatgagggt tctaaaaatt 10102 
tttatccttg cgttgaaata aaggcttctc ccgcaaaagt attacagggt cataatgttt 10162 
ttggtacaac cgatttagct ttatgctctg aggctttatt gcttaatttt gctaattctt 10222 
tgccttgcct gtatgattta ttggatgtt 10251 



<210> 583 
<211> 113 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: CJRA05 
protein sequence 

<400> 583 

Met Lys Lys Leu Leu Phe Ala He Pro Leu Val Val Pro Phe Tyr Ser 
15 10 15 

Gly Ala Ala Glu Ser His Leu Asp Gly Ala Ala Glu Thr Val Glu Ser 
20 25 30 

Cys Leu Ala Lys Ser His Thr Glu Asn Ser Phe Thr Asn Val Trp Lys 
35 40 45 

Asp Asp Lys Thr Leu Asp Arg Tyr Ala Asn Tyr Glu Gly Cys Leu Trp 
50 55 60 

Asn Ala Thr Gly Val Val Val Cys Thr Gly Asp Glu Thr Gin Cys Tyr 
65 70 75 80 

Gly Thr Trp Val Pro Ile Gly Leu Ala Ile Pro Glu Asn Glu Gly Gly
85 90 95 

Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly 
100 105 110 

Thr 



<210> 584 

<211> 152 

<212> PRT 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: CJRA05 
protein sequence 

<400> 584 

Ser Gly Asp Phe Asp Tyr Glu Lys Met Ala Asn Ala Asn Lys Gly Ala 
1 5 10 15 

Met Thr Glu Asn Ala Asp Glu Asn Ala Leu Gln Ser Asp Ala Lys Gly
20 25 30

Lys Leu Asp Ser Val Ala Thr Asp Tyr Gly Ala Ala Ile Asp Gly Phe
35 40 45

Ile Gly Asp Val Ser Gly Leu Ala Asn Gly Asn Gly Ala Thr Gly Asp
50 55 60

Phe Ala Gly Ser Asn Ser Gln Met Ala Gln Val Gly Asp Gly Asp Asn
65 70 75 80

Ser Pro Leu Met Asn Asn Phe Arg Gln Tyr Leu Pro Ser Leu Pro Gln
85 90 95

Ser Val Glu Cys Arg Pro Phe Val Phe Gly Ala Gly Lys Pro Tyr Glu
100 105 110

Phe Ser Ile Asp Cys Asp Lys Ile Asn Leu Phe Arg Gly Val Phe Ala
115 120 125

Phe Leu Leu Tyr Val Ala Thr Phe Met Tyr Val Phe Ser Thr Phe Ala
130 135 140

Asn Ile Leu Arg Asn Lys Glu Ser
145 150 



<210> 585 
<211> 15 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: CJRA05 
peptide sequence 

<400> 585 

Met Pro Val Leu Leu Gly Ile Pro Leu Leu Leu Arg Phe Leu Gly
1 5 10 15 



<210> 586 
<211> 348 
<212> PRT 
<213> Artificial Sequence

<220>

<223> Description of Artificial Sequence: CJRA05
protein sequence



<400> 586 



Met Ala Val Tyr Phe Val Thr Gly Lys Leu Gly Ser Gly Lys Thr Leu
1 5 10 15

Val Ser Val Gly Lys Ile Gln Asp Lys Ile Val Ala Gly Cys Lys Ile
20 25 30

Ala Thr Asn Leu Asp Leu Arg Leu Gln Asn Leu Pro Gln Val Gly Arg
35 40 45

Phe Ala Lys Thr Pro Arg Val Leu Arg Ile Pro Asp Lys Pro Ser Ile
50 55 60

Ser Asp Leu Leu Ala Ile Gly Arg Gly Asn Asp Ser Tyr Asp Glu Asn
65 70 75 80

Lys Asn Gly Leu Leu Val Leu Asp Glu Cys Gly Thr Trp Phe Asn Thr
85 90 95

Arg Ser Trp Asn Asp Lys Glu Arg Gln Pro Ile Ile Asp Trp Phe Leu
100 105 110

His Ala Arg Lys Leu Gly Trp Asp Ile Ile Phe Leu Val Gln Asp Leu
115 120 125

Ser Ile Val Asp Lys Gln Ala Arg Ser Ala Leu Ala Glu His Val Val
130 135 140

Tyr Cys Arg Arg Leu Asp Arg Ile Thr Leu Pro Phe Val Gly Thr Leu
145 150 155 160

Tyr Ser Leu Ile Thr Gly Ser Lys Met Pro Leu Pro Lys Leu His Val
165 170 175

Gly Val Val Lys Tyr Gly Asp Ser Gln Leu Ser Pro Thr Val Glu Arg
180 185 190

Trp Leu Tyr Thr Gly Lys Asn Leu Tyr Asn Ala Tyr Asp Thr Lys Gln
195 200 205

Ala Phe Ser Ser Asn Tyr Asp Ser Gly Val Tyr Ser Tyr Leu Thr Pro
210 215 220

Tyr Leu Ser His Gly Arg Tyr Phe Lys Pro Leu Asn Leu Gly Gln Lys
225 230 235 240

Met Lys Leu Thr Lys Ile Tyr Leu Lys Lys Phe Ser Arg Val Leu Cys
245 250 255

Leu Ala Ile Gly Phe Ala Ser Ala Phe Thr Tyr Ser Tyr Ile Thr Gln
260 265 270

Pro Lys Pro Glu Val Lys Lys Val Val Ser Gln Thr Tyr Asp Phe Asp
275 280 285

Lys Phe Thr Ile Asp Ser Ser Gln Arg Leu Asn Leu Ser Tyr Arg Tyr
290 295 300

Val Phe Lys Asp Ser Lys Gly Lys Leu Ile Asn Ser Asp Asp Leu Gln
305 310 315 320

Lys Gln Gly Tyr Ser Leu Thr Tyr Ile Asp Leu Cys Thr Val Ser Ile
325 330 335

Lys Lys Gly Asn Ser Asn Glu Ile Val Lys Cys Asn
340 345



<210> 587 
<211> 234 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: CJRA05 
protein sequence 

<400> 587 

Met Lys Lys Leu Leu Phe Ala Ile Pro Leu Val Val Pro Phe Tyr Ser
1 5 10 15

His Ser Ala Gln Asp Ile Gln Met Thr Gln Ser Pro Ala Thr Leu Ser
20 25 30

Leu Ser Pro Gly Glu Arg Ala Thr Leu Ser Cys Arg Ala Ser Gln Gly
35 40 45

Val Ser Ser Tyr Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln Ala Pro
50 55 60

Arg Leu Leu Ile Tyr Asp Ala Ser Asn Arg Ala Thr Gly Ile Pro Ala
65 70 75 80

Arg Phe Ser Gly Ser Gly Pro Gly Thr Asp Phe Thr Leu Thr Ile Ser
85 90 95

Ser Leu Glu Pro Glu Asp Phe Ala Val Tyr Tyr Cys Gln Gln Arg Asn
100 105 110

Trp His Pro Trp Thr Phe Gly Gln Gly Thr Lys Val Glu Ile Lys Arg
115 120 125

Thr Val Ala Ala Pro Ser Val Phe Ile Phe Pro Pro Ser Asp Glu Gln
130 135 140

Leu Lys Ser Gly Thr Ala Ser Val Val Cys Leu Leu Asn Asn Phe Tyr
145 150 155 160

Pro Arg Glu Ala Lys Val Gln Trp Lys Val Asp Asn Ala Leu Gln Ser
165 170 175

Gly Asn Ser Gln Glu Ser Val Thr Glu Arg Asp Ser Lys Asp Ser Thr
180 185 190

Tyr Ser Leu Ser Ser Thr Leu Thr Leu Ser Lys Ala Asp Tyr Glu Lys
195 200 205

His Lys Val Tyr Ala Cys Glu Val Thr His Gln Gly Leu Ser Ser Pro
210 215 220

Val Thr Lys Ser Phe Asn Arg Gly Glu Cys
225 230



<210> 588 
<211> 431 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: CJRA05 
protein sequence 

<400> 588 

Met Lys Tyr Leu Leu Pro Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala
1 5 10 15

Ala Gln Pro Ala Met Ala Glu Val Gln Leu Leu Glu Ser Gly Gly Gly
20 25 30

Leu Val Gln Pro Gly Gly Ser Leu Arg Leu Ser Cys Ala Ala Ser Gly
35 40 45

Phe Thr Phe Ser Thr Tyr Glu Met Arg Trp Val Arg Gln Ala Pro Gly
50 55 60

Lys Gly Leu Glu Trp Val Ser Tyr Ile Ala Pro Ser Gly Gly Asp Thr
65 70 75 80

Ala Tyr Ala Asp Ser Val Lys Gly Arg Phe Thr Ile Ser Arg Asp Asn
85 90 95

Ser Lys Asn Thr Leu Tyr Leu Gln Met Asn Ser Leu Arg Ala Glu Asp
100 105 110

Thr Ala Val Tyr Tyr Cys Ala Arg Arg Leu Asp Gly Tyr Ile Ser Tyr
115 120 125

Tyr Tyr Gly Met Asp Val Trp Gly Gln Gly Thr Thr Val Thr Val Ser
130 135 140

Ser Ala Ser Thr Lys Gly Pro Ser Val Phe Pro Leu Ala Pro Ser Ser
145 150 155 160

Lys Ser Thr Ser Gly Gly Thr Ala Ala Leu Gly Cys Leu Val Lys Asp
165 170 175

Tyr Phe Pro Glu Pro Val Thr Val Ser Trp Asn Ser Gly Ala Leu Thr
180 185 190

Ser Gly Val His Thr Phe Pro Ala Val Leu Gln Ser Ser Gly Leu Tyr
195 200 205

Ser Leu Ser Ser Val Val Thr Val Pro Ser Ser Ser Leu Gly Thr Gln
210 215 220

Thr Tyr Ile Cys Asn Val Asn His Lys Pro Ser Asn Thr Lys Val Asp
225 230 235 240

Lys Lys Val Glu Pro Lys Ser Cys Ala Ala Ala His His His His His
245 250 255

His Gly Ala Ala Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Asn Gly
260 265 270

Ala Ala Gln Ala Ser Ser Ala Ser Gly Asp Phe Asp Tyr Glu Lys Met
275 280 285

Ala Asn Ala Asn Lys Gly Ala Met Thr Glu Asn Ala Asp Glu Asn Ala
290 295 300

Leu Gln Ser Asp Ala Lys Gly Lys Leu Asp Ser Val Ala Thr Asp Tyr
305 310 315 320

Gly Ala Ala Ile Asp Gly Phe Ile Gly Asp Val Ser Gly Leu Ala Asn
325 330 335

Gly Asn Gly Ala Thr Gly Asp Phe Ala Gly Ser Asn Ser Gln Met Ala
340 345 350

Gln Val Gly Asp Gly Asp Asn Ser Pro Leu Met Asn Asn Phe Arg Gln
355 360 365

Tyr Leu Pro Ser Leu Pro Gln Ser Val Glu Cys Arg Pro Phe Val Phe
370 375 380

Ser Ala Gly Lys Pro Tyr Glu Phe Ser Ile Asp Cys Asp Lys Ile Asn
385 390 395 400

Leu Phe Arg Gly Val Phe Ala Phe Leu Leu Tyr Val Ala Thr Phe Met
405 410 415

Tyr Val Phe Ser Thr Phe Ala Asn Ile Leu Arg Asn Lys Glu Ser
420 425 430



<210> 589 
<211> 5 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Illustrative 
peptide 

<400> 589 

Glu Gly Gly Gly Ser 
1 5 
<210> 590
<211> 1275 
<212> DNA 

<213> Unknown Organism 

<220> 

<221> CDS 

<222> (1)..(1272)

<220> 

<223> Description of Unknown Organism: M13 nucleotide 
sequence 

<400> 590 

gtg aaa aaa tta tta ttc gca att cct tta gtt gtt cct ttc tat tct 48
Met Lys Lys Leu Leu Phe Ala Ile Pro Leu Val Val Pro Phe Tyr Ser
1 5 10 15

cac tcc gct gaa act gtt gaa agt tgt tta gca aaa ccc cat aca gaa 96
His Ser Ala Glu Thr Val Glu Ser Cys Leu Ala Lys Pro His Thr Glu
20 25 30

aat tca ttt act aac gtc tgg aaa gac gac aaa act tta gat cgt tac 144
Asn Ser Phe Thr Asn Val Trp Lys Asp Asp Lys Thr Leu Asp Arg Tyr
35 40 45

gct aac tat gag ggt tgt ctg tgg aat gct aca ggc gtt gta gtt tgt 192
Ala Asn Tyr Glu Gly Cys Leu Trp Asn Ala Thr Gly Val Val Val Cys
50 55 60

act ggt gac gaa act cag tgt tac ggt aca tgg gtt cct att ggg ctt 240
Thr Gly Asp Glu Thr Gln Cys Tyr Gly Thr Trp Val Pro Ile Gly Leu
65 70 75 80

gct atc cct gaa aat gag ggt ggt ggc tct gag ggt ggc ggt tct gag 288
Ala Ile Pro Glu Asn Glu Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu
85 90 95

ggt ggc ggt tct gag ggt ggc ggt act aaa cct cct gag tac ggt gat 336
Gly Gly Gly Ser Glu Gly Gly Gly Thr Lys Pro Pro Glu Tyr Gly Asp
100 105 110

aca cct att ccg ggc tat act tat atc aac cct ctc gac ggc act tat 384
Thr Pro Ile Pro Gly Tyr Thr Tyr Ile Asn Pro Leu Asp Gly Thr Tyr
115 120 125

ccg cct ggt act gag caa aac ccc gct aat cct aat cct tct ctt gag 432
Pro Pro Gly Thr Glu Gln Asn Pro Ala Asn Pro Asn Pro Ser Leu Glu
130 135 140

gag tct cag cct ctt aat act ttc atg ttt cag aat aat agg ttc cga 480
Glu Ser Gln Pro Leu Asn Thr Phe Met Phe Gln Asn Asn Arg Phe Arg
145 150 155 160

aat agg cag ggg gca tta act gtt tat acg ggc act gtt act caa ggc 528
Asn Arg Gln Gly Ala Leu Thr Val Tyr Thr Gly Thr Val Thr Gln Gly
165 170 175 
act gac ccc gtt aaa act tat tac cag tac act cct gta tca tca aaa 576
Thr Asp Pro Val Lys Thr Tyr Tyr Gln Tyr Thr Pro Val Ser Ser Lys
180 185 190

gcc atg tat gac gct tac tgg aac ggt aaa ttc aga gac tgc gct ttc 624
Ala Met Tyr Asp Ala Tyr Trp Asn Gly Lys Phe Arg Asp Cys Ala Phe
195 200 205

cat tct ggc ttt aat gag gat cca ttc gtt tgt gaa tat caa ggc caa 672
His Ser Gly Phe Asn Glu Asp Pro Phe Val Cys Glu Tyr Gln Gly Gln
210 215 220

tcg tct gac ctg cct caa cct cct gtc aat gct ggc ggc ggc tct ggt 720
Ser Ser Asp Leu Pro Gln Pro Pro Val Asn Ala Gly Gly Gly Ser Gly
225 230 235 240

ggt ggt tct ggt ggc ggc tct gag ggt ggt ggc tct gag ggt ggc ggt 768
Gly Gly Ser Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly
245 250 255

tct gag ggt ggc ggc tct gag gga ggc ggt tcc ggt ggt ggc tct ggt 816
Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly Ser Gly Gly Gly Ser Gly
260 265 270

tcc ggt gat ttt gat tat gaa aag atg gca aac gct aat aag ggg gct 864
Ser Gly Asp Phe Asp Tyr Glu Lys Met Ala Asn Ala Asn Lys Gly Ala
275 280 285

atg acc gaa aat gcc gat gaa aac gcg cta cag tct gac gct aaa ggc 912
Met Thr Glu Asn Ala Asp Glu Asn Ala Leu Gln Ser Asp Ala Lys Gly
290 295 300

aaa ctt gat tct gtc gct act gat tac ggt gct gct atc gat ggt ttc 960
Lys Leu Asp Ser Val Ala Thr Asp Tyr Gly Ala Ala Ile Asp Gly Phe
305 310 315 320

att ggt gac gtt tcc ggc ctt gct aat ggt aat ggt gct act ggt gat 1008
Ile Gly Asp Val Ser Gly Leu Ala Asn Gly Asn Gly Ala Thr Gly Asp
325 330 335

ttt gct ggc tct aat tcc caa atg gct caa gtc ggt gac ggt gat aat 1056
Phe Ala Gly Ser Asn Ser Gln Met Ala Gln Val Gly Asp Gly Asp Asn
340 345 350

tca cct tta atg aat aat ttc cgt caa tat tta cct tcc ctc cct caa 1104
Ser Pro Leu Met Asn Asn Phe Arg Gln Tyr Leu Pro Ser Leu Pro Gln
355 360 365

tcg gtt gaa tgt cgc cct ttt gtc ttt agc gct ggt aaa cca tat gaa 1152
Ser Val Glu Cys Arg Pro Phe Val Phe Ser Ala Gly Lys Pro Tyr Glu
370 375 380

ttt tct att gat tgt gac aaa ata aac tta ttc cgt ggt gtc ttt gcg 1200
Phe Ser Ile Asp Cys Asp Lys Ile Asn Leu Phe Arg Gly Val Phe Ala
385 390 395 400

ttt ctt tta tat gtt gcc acc ttt atg tat gta ttt tct acg ttt gct 1248
Phe Leu Leu Tyr Val Ala Thr Phe Met Tyr Val Phe Ser Thr Phe Ala
405 410 415

aac ata ctg cgt aat aag gag tct taa 1275
Asn Ile Leu Arg Asn Lys Glu Ser
420



<210> 591 
<211> 424 
<212> PRT 

<213> Unknown Organism 
<220> 

<223> Description of Unknown Organism: M13 protein 
sequence 

<400> 591 

Met Lys Lys Leu Leu Phe Ala Ile Pro Leu Val Val Pro Phe Tyr Ser
1 5 10 15

His Ser Ala Glu Thr Val Glu Ser Cys Leu Ala Lys Pro His Thr Glu
20 25 30

Asn Ser Phe Thr Asn Val Trp Lys Asp Asp Lys Thr Leu Asp Arg Tyr
35 40 45

Ala Asn Tyr Glu Gly Cys Leu Trp Asn Ala Thr Gly Val Val Val Cys
50 55 60

Thr Gly Asp Glu Thr Gln Cys Tyr Gly Thr Trp Val Pro Ile Gly Leu
65 70 75 80

Ala Ile Pro Glu Asn Glu Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu
85 90 95

Gly Gly Gly Ser Glu Gly Gly Gly Thr Lys Pro Pro Glu Tyr Gly Asp
100 105 110

Thr Pro Ile Pro Gly Tyr Thr Tyr Ile Asn Pro Leu Asp Gly Thr Tyr
115 120 125

Pro Pro Gly Thr Glu Gln Asn Pro Ala Asn Pro Asn Pro Ser Leu Glu
130 135 140

Glu Ser Gln Pro Leu Asn Thr Phe Met Phe Gln Asn Asn Arg Phe Arg
145 150 155 160

Asn Arg Gln Gly Ala Leu Thr Val Tyr Thr Gly Thr Val Thr Gln Gly
165 170 175

Thr Asp Pro Val Lys Thr Tyr Tyr Gln Tyr Thr Pro Val Ser Ser Lys
180 185 190

Ala Met Tyr Asp Ala Tyr Trp Asn Gly Lys Phe Arg Asp Cys Ala Phe
195 200 205

His Ser Gly Phe Asn Glu Asp Pro Phe Val Cys Glu Tyr Gln Gly Gln
210 215 220

Ser Ser Asp Leu Pro Gln Pro Pro Val Asn Ala Gly Gly Gly Ser Gly
225 230 235 240

Gly Gly Ser Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly
245 250 255

Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly Ser Gly Gly Gly Ser Gly
260 265 270

Ser Gly Asp Phe Asp Tyr Glu Lys Met Ala Asn Ala Asn Lys Gly Ala
275 280 285

Met Thr Glu Asn Ala Asp Glu Asn Ala Leu Gln Ser Asp Ala Lys Gly
290 295 300

Lys Leu Asp Ser Val Ala Thr Asp Tyr Gly Ala Ala Ile Asp Gly Phe
305 310 315 320

Ile Gly Asp Val Ser Gly Leu Ala Asn Gly Asn Gly Ala Thr Gly Asp
325 330 335

Phe Ala Gly Ser Asn Ser Gln Met Ala Gln Val Gly Asp Gly Asp Asn
340 345 350

Ser Pro Leu Met Asn Asn Phe Arg Gln Tyr Leu Pro Ser Leu Pro Gln
355 360 365

Ser Val Glu Cys Arg Pro Phe Val Phe Ser Ala Gly Lys Pro Tyr Glu
370 375 380

Phe Ser Ile Asp Cys Asp Lys Ile Asn Leu Phe Arg Gly Val Phe Ala
385 390 395 400

Phe Leu Leu Tyr Val Ala Thr Phe Met Tyr Val Phe Ser Thr Phe Ala
405 410 415

Asn Ile Leu Arg Asn Lys Glu Ser
420



<210> 592 
<211> 35 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 592 

caacgatgat cgtatggcgc atgctgccga gacag 35 



<210> 593 

<211> 1355 

<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: M13-III 
nucleotide sequence 

<220>

<221> CDS 

<222> (1)..(1305) 

<400> 593 

gcg gcc gca cat cat cat cac cat cac ggg gcc gca gaa caa aaa ctc 48
Ala Ala Ala His His His His His His Gly Ala Ala Glu Gln Lys Leu
1 5 10 15

atc tca gaa gag gat ctg aat ggg gcc gca tag gct agc gat atc aac 96
Ile Ser Glu Glu Asp Leu Asn Gly Ala Ala Ala Ser Asp Ile Asn
20 25 30

gat gat cgt atg gct tct act gcy gar acw gty gaa wsy tgy ytr gcm 144
Asp Asp Arg Met Ala Ser Thr Ala Glu Thr Val Glu Ser Cys Leu Ala
35 40 45

aar ccy cay acw gar aat wsw tty acw aay gts tgg aar gay gay aar 192
Lys Pro His Thr Glu Asn Ser Phe Thr Asn Val Trp Lys Asp Asp Lys
50 55 60

acy ytw gat cgw tay gcy aay tay gar ggy tgy ytr tgg aat gcy acm 240
Thr Leu Asp Arg Tyr Ala Asn Tyr Glu Gly Cys Leu Trp Asn Ala Thr
65 70 75

ggc gty gtw gty tgy ack ggy gay gar acw car tgy tay ggy acr tgg 288
Gly Val Val Val Cys Thr Gly Asp Glu Thr Gln Cys Tyr Gly Thr Trp
80 85 90 95

gtk cck atw ggs ytw gcy atm cck gar aay gar ggy ggy ggy wsy gar 336
Val Pro Ile Gly Leu Ala Ile Pro Glu Asn Glu Gly Gly Gly Ser Glu
100 105 110

ggy ggy ggy wsy gar ggy ggy ggw tcy gar ggw ggy ggw acy aar cck 384
Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly Thr Lys Pro
115 120 125

cck gar tay ggy gay acw cck atw cck ggy tay acy tay aty aay cck 432
Pro Glu Tyr Gly Asp Thr Pro Ile Pro Gly Tyr Thr Tyr Ile Asn Pro
130 135 140

ytm gay ggm acy tay cck cck ggy acy gar car aay ccy gcy aay cck 480
Leu Asp Gly Thr Tyr Pro Pro Gly Thr Glu Gln Asn Pro Ala Asn Pro
145 150 155

aay ccw wsy ytw gar gar wsy car cck ytw aay acy tty atg tty car 528
Asn Pro Ser Leu Glu Glu Ser Gln Pro Leu Asn Thr Phe Met Phe Gln
160 165 170 175

aay aay mgk tty mgr aay mgk car ggk gcw ytw acy gtk tay ack ggm 576
Asn Asn Arg Phe Arg Asn Arg Gln Gly Ala Leu Thr Val Tyr Thr Gly
180 185 190
acy gty acy car ggy acy gay ccy gty aar acy tay tay car tay acy 624
Thr Val Thr Gln Gly Thr Asp Pro Val Lys Thr Tyr Tyr Gln Tyr Thr
195 200 205

cck gtm tcr wsw aar gcy atg tay gay gcy tay tgg aay ggy aar tty 672
Pro Val Ser Ser Lys Ala Met Tyr Asp Ala Tyr Trp Asn Gly Lys Phe
210 215 220

mgw gay tgy gcy tty cay wsy ggy tty aay gar gay ccw tty gty tgy 720
Arg Asp Cys Ala Phe His Ser Gly Phe Asn Glu Asp Pro Phe Val Cys
225 230 235

gar tay car ggy car wsk wsy gay ytr cck car ccw cck gty aay gck 768
Glu Tyr Gln Gly Gln Ser Ser Asp Leu Pro Gln Pro Pro Val Asn Ala
240 245 250 255

ggy ggy ggy wsy ggy ggw ggy wsy ggy ggy ggy wsy gar ggy ggw ggy 816
Gly Gly Gly Ser Gly Gly Gly Ser Gly Gly Gly Ser Glu Gly Gly Gly
260 265 270

wsy gar ggw ggy ggy wsy ggr ggy ggy wsy ggy wsy ggy gay tty gay 864
Ser Glu Gly Gly Gly Ser Gly Gly Gly Ser Gly Ser Gly Asp Phe Asp
275 280 285

tay gar aar atg gcw aay gcy aay aar ggs gcy atg acy gar aay gcy 912
Tyr Glu Lys Met Ala Asn Ala Asn Lys Gly Ala Met Thr Glu Asn Ala
290 295 300

gay gar aay gcr ctr car wst gay gcy aar ggy aar ytw gay wsy gtc 960
Asp Glu Asn Ala Leu Gln Ser Asp Ala Lys Gly Lys Leu Asp Ser Val
305 310 315

gcy acw gay tay ggt gct gcy atc gay ggy tty aty ggy gay gty wsy 1008
Ala Thr Asp Tyr Gly Ala Ala Ile Asp Gly Phe Ile Gly Asp Val Ser
320 325 330 335

ggy ctk gct aay ggy aay ggw gcy acy ggw gay tty gcw ggy tck aat 1056
Gly Leu Ala Asn Gly Asn Gly Ala Thr Gly Asp Phe Ala Gly Ser Asn
340 345 350

tcy car atg gcy car gty ggw gay ggk gay aay wsw cck ytw atg aay 1104
Ser Gln Met Ala Gln Val Gly Asp Gly Asp Asn Ser Pro Leu Met Asn
355 360 365

aay tty mgw car tay ytw cck tcy cty cck car wsk gty gar tgy cgy 1152
Asn Phe Arg Gln Tyr Leu Pro Ser Leu Pro Gln Ser Val Glu Cys Arg
370 375 380

ccw tty gty tty wsy gcy ggy aar ccw tay gar tty wsy aty gay tgy 1200
Pro Phe Val Phe Ser Ala Gly Lys Pro Tyr Glu Phe Ser Ile Asp Cys
385 390 395

gay aar atm aay ytw ttc cgy ggy gty tty gck tty ytk yta tay gty 1248
Asp Lys Ile Asn Leu Phe Arg Gly Val Phe Ala Phe Leu Leu Tyr Val
400 405 410 415

gcy acy tty atg tay gtw tty wsy ack tty gcy aay atw ytr cgy aay 1296
Ala Thr Phe Met Tyr Val Phe Ser Thr Phe Ala Asn Ile Leu Arg Asn
420 425 430

aar gar wsy tagtgatctc ctaggaagcc cgcctaatga gcgggctttt 1345
Lys Glu Ser

tttttctggt 1355
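
Reader's note (not part of the sequence data): the coding rows of SEQ ID NO: 593 use IUPAC ambiguity codes such as y, r, w, s, k and m in many codon positions. The following minimal Python sketch, offered only as an illustration, expands one degenerate codon into the concrete codons it covers and reports the amino acids those codons encode under the standard genetic code. The function names expand_codon and encoded_amino_acids are hypothetical, and the use of the standard code is an assumption.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes, lowercase as in the listing.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at",
    "k": "gt", "m": "ac", "b": "cgt", "d": "agt",
    "h": "act", "v": "acg", "n": "acgt",
}

# Standard genetic code in the conventional t, c, a, g codon ordering
# (assumption: the listing is read with the standard code).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand_codon(codon):
    """Return every concrete codon covered by a degenerate codon."""
    choices = (IUPAC[base] for base in codon.lower())
    return ["".join(p) for p in product(*choices)]


def encoded_amino_acids(codon):
    """Return the set of one-letter amino acids the codon can encode."""
    return {CODON_TABLE[c] for c in expand_codon(codon)}


if __name__ == "__main__":
    for codon in ("gcy", "aar", "wsy"):
        print(codon, expand_codon(codon), sorted(encoded_amino_acids(codon)))
```

Some ambiguity codes cover codons for more than one amino acid (wsy, for instance, spans codons for Ser, Thr and Cys), so the annotated translation line under each row remains the authority for the intended residue.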



<210> 594 
<211> 434 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: M13-III 
protein sequence 

<400> 594 

Ala Ala Ala His His His His His His Gly Ala Ala Glu Gln Lys Leu
1 5 10 15

Ile Ser Glu Glu Asp Leu Asn Gly Ala Ala Ala Ser Asp Ile Asn Asp
20 25 30

Asp Arg Met Ala Ser Thr Ala Glu Thr Val Glu Ser Cys Leu Ala Lys
35 40 45

Pro His Thr Glu Asn Ser Phe Thr Asn Val Trp Lys Asp Asp Lys Thr
50 55 60

Leu Asp Arg Tyr Ala Asn Tyr Glu Gly Cys Leu Trp Asn Ala Thr Gly
65 70 75 80

Val Val Val Cys Thr Gly Asp Glu Thr Gln Cys Tyr Gly Thr Trp Val
85 90 95

Pro Ile Gly Leu Ala Ile Pro Glu Asn Glu Gly Gly Gly Ser Glu Gly
100 105 110

Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly Thr Lys Pro Pro
115 120 125

Glu Tyr Gly Asp Thr Pro Ile Pro Gly Tyr Thr Tyr Ile Asn Pro Leu
130 135 140

Asp Gly Thr Tyr Pro Pro Gly Thr Glu Gln Asn Pro Ala Asn Pro Asn
145 150 155 160

Pro Ser Leu Glu Glu Ser Gln Pro Leu Asn Thr Phe Met Phe Gln Asn
165 170 175

Asn Arg Phe Arg Asn Arg Gln Gly Ala Leu Thr Val Tyr Thr Gly Thr
180 185 190

Val Thr Gln Gly Thr Asp Pro Val Lys Thr Tyr Tyr Gln Tyr Thr Pro
195 200 205

Val Ser Ser Lys Ala Met Tyr Asp Ala Tyr Trp Asn Gly Lys Phe Arg
210 215 220

Asp Cys Ala Phe His Ser Gly Phe Asn Glu Asp Pro Phe Val Cys Glu
225 230 235 240

Tyr Gln Gly Gln Ser Ser Asp Leu Pro Gln Pro Pro Val Asn Ala Gly
245 250 255

Gly Gly Ser Gly Gly Gly Ser Gly Gly Gly Ser Glu Gly Gly Gly Ser
260 265 270

Glu Gly Gly Gly Ser Gly Gly Gly Ser Gly Ser Gly Asp Phe Asp Tyr
275 280 285

Glu Lys Met Ala Asn Ala Asn Lys Gly Ala Met Thr Glu Asn Ala Asp
290 295 300

Glu Asn Ala Leu Gln Ser Asp Ala Lys Gly Lys Leu Asp Ser Val Ala
305 310 315 320

Thr Asp Tyr Gly Ala Ala Ile Asp Gly Phe Ile Gly Asp Val Ser Gly
325 330 335

Leu Ala Asn Gly Asn Gly Ala Thr Gly Asp Phe Ala Gly Ser Asn Ser
340 345 350

Gln Met Ala Gln Val Gly Asp Gly Asp Asn Ser Pro Leu Met Asn Asn
355 360 365

Phe Arg Gln Tyr Leu Pro Ser Leu Pro Gln Ser Val Glu Cys Arg Pro
370 375 380

Phe Val Phe Ser Ala Gly Lys Pro Tyr Glu Phe Ser Ile Asp Cys Asp
385 390 395 400

Lys Ile Asn Leu Phe Arg Gly Val Phe Ala Phe Leu Leu Tyr Val Ala
405 410 415

Thr Phe Met Tyr Val Phe Ser Thr Phe Ala Asn Ile Leu Arg Asn Lys
420 425 430

Glu Ser



<210> 595 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 



<400> 595 

cgttgatatc gctagcctat gc 22



<210> 596
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 596 

gataggctta gctagcccgg agaacgaagg 

<210> 597 
<211> 37 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 597 

ctttcacagc ggtttcgcta gcgacccttt tgtctgc 



<210> 598 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 598 

ctttcacagc ggtttcgcta gcgacccttt tgtcagcgag taccagggtc 



<210> 599 
<211> 37 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 599 

gactgtctcg gcagcatgcg ccatacgatc atcgttg 



<210> 600 
<211> 37 
<212> DNA 

<213> Artificial Sequence 



<220> 
<223> Description of Artificial Sequence: Synthetic
oligonucleotide 

<220> 

<221> CDS
<222> (2)..(25)

<400> 600

c aac gat gat cgt atg gcg cat gct gccgagacag tc 37
Asn Asp Asp Arg Met Ala His Ala 
1 5 



<210> 601 
<211> 8 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 601 

Asn Asp Asp Arg Met Ala His Ala 
1 5 
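
Reader's note (not part of the sequence data): the CDS feature of SEQ ID NO: 600 spans positions 2 to 25, and its annotated translation is repeated as SEQ ID NO: 601. The minimal Python sketch below, given only as an illustration, translates that span with the standard genetic code and also computes the reverse complement of the oligonucleotide. The helper names translate_cds and reverse_complement are hypothetical, and the sequences are transcribed here in lowercase with spaces removed.

```python
# Standard genetic code in the conventional t, c, a, g codon ordering
# (assumption: the listing is read with the standard code).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter amino acid codes ("Ter" marks a stop codon).
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}

COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def translate_cds(seq, start, end):
    """Translate positions start..end (1-based, inclusive) codon by codon."""
    cds = seq.lower()[start - 1:end]
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    return " ".join(THREE_LETTER[CODON_TABLE[c]] for c in codons)


# Transcriptions of SEQ ID NO: 600 and SEQ ID NO: 599 (spaces removed).
SEQ_600 = "caacgatgatcgtatggcgcatgctgccgagacagtc"
SEQ_599 = "gactgtctcggcagcatgcgccatacgatcatcgttg"

if __name__ == "__main__":
    print(translate_cds(SEQ_600, 2, 25))
    print(reverse_complement(SEQ_600) == SEQ_599)
```

With these transcriptions, the translated span reads Asn Asp Asp Arg Met Ala His Ala, matching SEQ ID NO: 601, and the reverse complement of SEQ ID NO: 600 equals SEQ ID NO: 599.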



<210> 602 
<211> 37 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 602 

ctttcacagc ggtttgcatg cagacccttt tgtctgc 



<210> 603 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 603 

ctttcacagc ggtttgcatg cagacccttt tgtcagcgag taccagggtc 50



<210> 604 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Illustrative 
peptide 

<400> 604



Tyr Ala Asp Ser Val Lys Gly 
1 5 



<210> 605 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 605 

cctcgacagc gaagtgcaca g 21 



<210> 606 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 606 

ggctgagtca agacgctctg tgcacttcgc tgtcgagg 38 



<210> 607 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Illustrative 
peptide 

<400> 607 

Gln Ser Ala Leu Thr Gln Pro
1 5 



<210> 608 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 



<400> 608 

cctctgtcac agtgcacaag ac 22



<210> 609
<211> 42 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 609 

cctctgtcac agtgcacaag acatccagat gacccagtct cc 



<210> 610 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 610 

gggaggatgg agactgggtc gtctggatgt cttgtgcact gtgacagagg 



<210> 611 
<211> 11 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Illustrative 
peptide 

<400> 611 

Gln Asp Ile Gln Met Thr Gln Ser Pro Ser Ser
1 5 10 



<210> 612 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 612 

gactgggtgt agtgatctag 



<210> 613 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 613 

ggtgtagtga tcttctagtg acaactct 



<210> 614 
<211> 6 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 614 

Val Ser Ser Arg Asp Asn 
1 5 



<210> 615 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 
<221> CDS 
<222> (1)..(15) 

<400> 615 

tac tat tgt gcg aaa 
Tyr Tyr Cys Ala Lys 
1 5 



<210> 616 
<211> 5 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 616 

Tyr Tyr Cys Ala Lys 
1 5 



<210> 617 
<211> 36 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide

<400> 617 

ggtgccgata ggcttgcatg caccggagaa cgaagg 



<210> 618 
<211> 95 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 618 

cgcttcacta agtctagaga caactctaag aatactctct acttgcagat gaacagctta 
agggctgagg acactgcagt ctactattgt acgag 



<210> 619 
<211> 10 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 



<220> 

<221> modified_base
<222> (4)..(7)

<223> A, T, C, G, other or unknown 

<400> 619 
gatnnnnatc 10
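
Reader's note (not part of the sequence data): SEQ ID NO: 619 contains four unspecified bases (n) at positions 4 to 7, so it describes a pattern rather than a single sequence. The minimal Python sketch below, given only as an illustration, converts such a pattern into a regular expression and reports the 1-based start positions of all matches, including overlapping ones. The function name find_sites and the test sequence are hypothetical.

```python
import re

# IUPAC ambiguity codes expressed as regular-expression character classes.
IUPAC_REGEX = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "[ag]", "y": "[ct]", "s": "[cg]", "w": "[at]",
    "k": "[gt]", "m": "[ac]", "b": "[cgt]", "d": "[agt]",
    "h": "[act]", "v": "[acg]", "n": "[acgt]",
}


def find_sites(pattern, sequence):
    """Return 1-based start positions where the pattern matches the sequence."""
    body = "".join(IUPAC_REGEX[base] for base in pattern.lower())
    # A lookahead lets overlapping occurrences be reported as well.
    regex = re.compile("(?=(" + body + "))")
    return [m.start() + 1 for m in regex.finditer(sequence.lower())]


if __name__ == "__main__":
    # Hypothetical test sequence, not taken from the listing.
    print(find_sites("gatnnnnatc", "ttgatacgtatcaa"))
```

The same helper can be applied to any pattern in the listing that is written with n or other ambiguity codes, such as SEQ ID NO: 625.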



<210> 620 
<211> 10 
<212> PRT 

<213> Unknown Organism 
<220> 

<223> Description of Unknown Organism: MALIA3-derived 
peptide 

<400> 620 

Met Lys Leu Leu Asn Val Ile Asn Phe Val
1 5 10 



<210> 621 
<211> 29
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: CJRA05-derived 
peptide 

<400> 621 

Met Ser Val Leu Val Tyr Ser Phe Ala Ser Phe Val Leu Gly Trp Cys 
1 5 10 15

Leu Arg Ser Gly Ile Thr Tyr Phe Thr Arg Leu Met Glu
20 25 



<210> 622 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Illustrative 
nucleotide sequence 

<400> 622 
tttttttttt ttttt 



<210> 623 
<211> 87 
<212> PRT 

<213> Unknown Organism 
<220> 

<223> Description of Unknown Organism: MALIA3-derived 
peptide 

<400> 623 

Met Ile Lys Val Glu Ile Lys Pro Ser Gln Ala Gln Phe Thr Thr Arg
1 5 10 15

Ser Gly Val Ser Arg Gln Gly Lys Pro Tyr Ser Leu Asn Glu Gln Leu
20 25 30

Cys Tyr Val Asp Leu Gly Asn Glu Tyr Pro Val Leu Val Lys Ile Thr
35 40 45

Leu Asp Glu Gly Gln Pro Ala Tyr Ala Pro Gly Leu Tyr Thr Val His
50 55 60

Leu Ser Ser Phe Lys Val Gly Gln Phe Gly Ser Leu Met Ile Asp Arg
65 70 75 80

Leu Arg Leu Val Pro Ala Lys
85
<210> 624
<211> 29 
<212> PRT 

<213> Unknown Organism 



<220> 

<223> Description of Unknown Organism: MALIA3-derived
peptide 

<400> 624 

Met Ser Val Leu Val Tyr Ser Phe Ala Ser Phe Val Leu Gly Trp Cys 
1 5 10 15

Leu Arg Ser Gly Ile Thr Tyr Phe Thr Arg Leu Met Glu
20 25 



<210> 625 
<211> 10 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<220> 

<221> modified_base
<222> (7)..(10)

<223> A, T, C, G, other or unknown

<400> 625 
ctcttcnnnn 



<210> 626 
<211> 87 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: CJRA05-derived 
peptide 

<400> 626 

Met Ile Lys Val Glu Ile Lys Pro Ser Gln Ala Gln Phe Thr Thr Arg
1 5 10 15

Ser Gly Val Ser Arg Gln Gly Lys Pro Tyr Ser Leu Asn Glu Gln Leu
20 25 30

Cys Tyr Val Asp Leu Gly Asn Glu Tyr Pro Val Leu Val Lys Ile Thr
35 40 45

Leu Asp Glu Gly Gln Pro Ala Tyr Ala Pro Gly Leu Tyr Thr Val His
50 55 60

Leu Ser Ser Phe Lys Val Gly Gln Phe Gly Ser Leu Met Ile Asp Arg
65 70 75 80

Leu Arg Leu Val Pro Ala Lys
85



<210> 627 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: CJRA05-derived
peptide 

<400> 627 

Met Lys Leu Leu Asn Val Ile Asn Phe Val
1 5 10



<210> 628 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 628 

gacccagtct ccatcctcc 



<210> 629 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 629 

gactcagtct ccactctcc 



<210> 630 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 



<400> 630 

gacgcagtct ccaggcacc 19



<210> 631
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 631 

gacgcagtct ccagccacc 



<210> 632 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 632 

gtctcctgga cagtcgatc 



<210> 633 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 633 

ggccttggga cagacagtc 



<210> 634 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 634 

gtctcctgga cagtcagtc 



<210> 635 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthetic 
oligonucleotide 

<400> 635

ggccccaggg cagagggtc 